Preface


This volume contains the proceedings of the two-day departmental seminar organised by the 
Department of Computer Applications (MCA) of Vidya Academy of Science & Technology 
during 22-23 November 2021. The seminar was the culmination of coursework (with
course code RLMCA 341 Seminar) to be completed by the MCA students of APJ Abdul
Kalam Technological University during the Fifth Semester of the MCA programme.

The syllabus of the course RLMCA 341 Seminar specifies the objective of the course as 
follows: “To enable the students to gain knowledge in any of the technically relevant current 
topics on computer science/information technology/research, and acquire the confidence in 
presenting the topic and preparing a report.” 

Again as per the syllabus, as part of the course, “each student is expected to undertake 
a detailed study on a technically relevant current topic in computer science/information 
technology under the supervision of a faculty member, by referring articles published in 
reputed journals/conference proceedings. Each student has to submit a seminar report, 
based on these papers; the report must not be reproduction of any original paper. The 
topic has to be presented taking a duration of 15-20 minutes. The report and slides for
presentation has to be prepared using free typesetting software such as LaTeX.”

In Vidya Academy of Science & Technology, the supervising teachers helped the stu- 
dents to identify the areas in which the students were to work and the teachers provided the 
students with some initial learning materials in the form of papers. After the initial reading 
of these materials, the students were asked to search for additional reading materials them- 
selves. The students were required to study the papers and present a “study report/study 
paper” in a Departmental Seminar. The reports/papers collected in this volume are the 
study papers prepared by the students and presented in the Departmental Seminar. 

As part of the learning process, the students were also required to present the paper 
in the IEEE conference paper format. To facilitate this, the students were given a basic 
introduction to the LaTeX software and the IEEEtran document style.

In addition to gaining knowledge in any of the technically relevant current topics on
computer science/information technology/research, the course also aimed at giving the
students hands-on experience in preparing a conference/seminar paper. The expected learning
outcomes of the course included acquiring a clear knowledge about the following aspects of
preparing and presenting a high quality research paper:


• The structure of a research paper
• The process of literature survey
• The accurate preparation of a bibliography and its citations in the paper
• The IEEE format for the preparation of conference/journal papers
• The concepts of “Abstracts”, “Keywords”, and the like
• The methodology of presenting a multi-author paper in a seminar/conference.


The articles compiled in this Proceedings are not even moderately edited. The editors 
have only ensured that the basic learning outcomes outlined above have been met. However, 
the editors have tried to ensure that the titles of chapters, sections, etc., the abstract, 
figure and table captions, and the like are as per IEEE guidelines. The references have not 
been checked for accuracy and completion. The papers have not been edited for grammar, 
punctuation, spelling or style.¹

The present work is only a record of the activities of the course referred to above and 
it is prepared only for private circulation. To the best of our understanding the authors of 
the papers have given proper attribution to ideas and material presented in the papers. If
any attribution is missing or improper, this is unintentional. Accordingly, the contents
have not been subjected to plagiarism tests.

It is believed that the teachers as well as the students have greatly enjoyed doing the seminar
course. There is still much scope for improvement. It is our hope that the future batches
of students will have a stronger and wider learning experience from similar seminar courses.


January 2022 Editors 


¹ For different models of editing, see, for example, “IEEE Editorial Style Manual”, [Online]. Available:
https://www.ieee.org/documents/style_manual.pdf (April 2017).
An Overview of Robotics


Anu Job, Bushara C 
and Jelena Babu P 
Vidya Academy of Science & Technology 
Thrissur - 680501, Kerala 


Abstract—Robotics is a branch of computer science and
engineering. It deals with the design, construction, operation,
and use of robots and of the computer systems for their control,
sensory feedback, and information processing. A robot is a unit
that interacts with the physical world by means of sensors,
actuators, and information processing. Cloud robotics is a field
of robotics that draws on cloud technologies such as cloud
computing, cloud storage, and other Internet technologies.
Artificial intelligence (AI) is a branch of computer science. Most
AI programs are not used to control robots. Even when AI is used
to control robots, the AI algorithms are only part of the larger
robotic system, which also includes sensors, actuators, and
non-AI programming. In this paper, we discuss robots across
different time periods (past, present, and future).

Index Terms—Cloud Robotics, Artificial intelligence, Cloud 
Technologies 


I. INTRODUCTION 


The earliest robots as we know them were created in the early
1950s by George C. Devol, an inventor from Louisville,
Kentucky. He invented and patented a reprogrammable
manipulator called “Unimate”, from “Universal Automation”.
For the next decade, he attempted to sell his product in the
industry, but did not succeed. In the late 1960s, Joseph
Engelberger bought Devol’s robot patent and was able to modify it
into an industrial robot and form a company called Unimation
to produce and market the robots. For his efforts and successes,
Engelberger became known as “the Father of Robotics”.

Currently, the field of robotics is transforming and evolving
very fast. Contemporary robots are being used to do jobs that
are dangerous, dirty or boring. From painting to welding,
robots are being used on a very large scale in industries such as
automobile manufacturing. According to World Robotics,
the estimated number of robots installed worldwide is around
50 million. But we have now started using robots for more
than just industrial purposes. Robots are now being used to
mop floors, to make burgers, and even to act as emotional
support for lonely people. Much of what we dreamt about
robots a decade ago has slowly become a reality.

In future, robots are likely to handle a much wider range of
activities. A Forrester report predicted that robots would
eliminate 6 percent of all jobs in the U.S. by 2021. McKinsey’s
assessment is even more expansive: by 2030, one-third of
American jobs could be replaced by robots. Studies also suggest
that the use of industrial robots is expected to grow by 175%
over the next nine years, which will result in more competition
and innovation and drive these modern technologies forward.


Fig. 1. Image of Al-Jazari’s humanoid robots


II. HISTORY OF ROBOTICS 


From the time of ancient civilization, there have been 
many accounts of user-configurable automated devices and 
even automata resembling humans and other animals, designed 
primarily as entertainment. 

One example is Al-Jazari’s humanoid robots, a musical
automaton in the form of a boat with four automatic
musicians that floated on a lake to entertain guests at royal
parties (see Figure 1). The Egyptian water clock is one of
the very first cases of “robotics” in human history. The oldest
example of the water clock was found in the tomb of Amenhotep.

As mechanical techniques developed through the Industrial 
age, there appeared more practical applications such as auto- 
mated machines, remote-control and wireless remote-control. 
The first uses of modern robots were in factories as industrial 
robots. These industrial robots were fixed machines capable 
of doing tasks which allowed production with less human 
work. Digitally programmed industrial robots with artificial 
intelligence have been built since the 2000s. There is only 
one condition in which we can imagine managers not needing 
subordinates, and masters not needing slaves. This condition 
would be that each instrument could do its own work. 

Japan has long been recognized as the robotics powerhouse.
Robotic automation enables the operation of production lines
with less manpower. As such, industrial robots have become
widely used in the automotive industry to complete various
tasks like hauling, welding and painting, while at the same
time liberating humans from working in harsh conditions. The
history of robots has its origins in the ancient world. During
the industrial revolution, humans developed the structural
engineering capability to control electricity so that machines
could be powered with small motors. In the early 20th century,
the notion of a humanoid machine was developed.


Fig. 2. A musical instrument playing robot 


Figure 3 shows a graphic representation of industrial robot use
in 2016, according to World Robotics. The graph gives the
number of industrial robots installed per 10,000 employees
in each country.


[Figure 3: horizontal bar chart. Horizontal axis: number of employees, with ticks at 2,500, 5,000, 7,500 and 10,000. Legend: number of robots installed; not yet installed. Data labels (robots installed): Japan 303, Korea 631, Singapore 488, Germany 203, United States 189, China 68, Thailand 45. Annotation: the proportion representing the number of robots yet to be installed amounts to 94% - 99%.]


Fig. 3. The number of industrial robots installed per 10,000 employees in
the manufacturing industry, 2016
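
The "94% - 99% not yet installed" annotation in Fig. 3 can be reproduced with simple arithmetic. The sketch below assumes the data labels as they appear in the extracted figure and treats each value as a share of the 10,000-employee scale; the function names are illustrative only.

```python
# Robots per 10,000 manufacturing employees, as printed on the data labels of Fig. 3.
ROBOTS_PER_10K = {
    "Japan": 303, "Korea": 631, "Singapore": 488, "Germany": 203,
    "United States": 189, "China": 68, "Thailand": 45,
}


def installed_share(robots_per_10k):
    """Percentage of the 10,000-employee scale covered by installed robots."""
    return robots_per_10k / 10_000 * 100


def not_installed_share(robots_per_10k):
    """Remaining percentage, labelled 'not yet installed' in the figure."""
    return 100 - installed_share(robots_per_10k)


for country, density in ROBOTS_PER_10K.items():
    print(f"{country:14s} installed {installed_share(density):5.2f}%  "
          f"not yet installed {not_installed_share(density):5.2f}%")
```

Under this reading, the highest density in the figure (Korea) leaves about 93.7% not yet installed and the lowest (Thailand) about 99.6%, which is consistent with the range quoted in the annotation.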


Another obstacle to robot installation was the lack
of flexibility in production lines with robots. In order to secure
the safety of the workers, the robots were unable to work
in the same area as humans and hence required their own
designated spaces. Due to the strict operational restrictions,
such as installing a safety fence between robots and humans,
the construction of a more versatile production line posed a
challenge. There was also an issue in terms of usability.
Operating industrial robots usually requires a programming
process which teaches the motion sequence to the robots. This
process requires an engineer with expert knowledge, and the
programming itself is an excessive workload. change within a
short span.


Fig. 4. Kitchen robot


III. CURRENT STATUS OF ROBOTICS 


Today, robots are used very widely, and almost any
automatic machine is called a robot. In science fiction
literature a robot is an artificial machine with capabilities
similar to those of humans, but in the real world the term is
also used for simple machines. An example is the multipurpose
kitchen machine known as a “Kitchen Robot” (see
Figure 4). Likewise, in industry, a controlled articulated
mechanical system is called a robot, and a mobile robot is an
intelligent autonomous vehicle. The term robot is thus
becoming a general term for any automatic
machine.


Fig. 5. Image of a humanoid robot 


A. Definition of a Robot 


There have been multiple attempts to redefine the term
robot.
• In industry, a robot is an “automatically controlled,
reprogrammable multipurpose manipulator programmable
in three or more axes” (ISO definition).
• In relation to web search engines, a robot is an
“automated program that follows links to visit web sites on
behalf of search engines or directories”.


There are many reasons for these changes in
the meaning of the word “robot”, but the main one is
marketing. The term is used in commercial
marketing to suggest that a product is innovative
and advanced (e.g., “Kitchen Robot”). But this over-marketing
has negative effects on those who are
still researching in robotics:


• Many people think that robotics is a mature technology.
• Other people think that researchers in robotics are
not truthful.


In some industrial sectors, such as construction, agriculture,
and cleaning, over-expectation of robotic applications was
generated, and the resulting disappointment has led these
sectors to distrust robotics. Adopting recent technologies such
as image recognition allows a robot to identify, grasp, and
transport a given object even if the object is placed
arbitrarily.

duAro is an industrial robot developed by Kawasaki
Heavy Industries. It is also known as the dual-arm SCARA
robot. The food, cosmetics, and pharmaceutical industries are
also looking to implement more robots in their fields.


IV. FUTURE OF ROBOTICS 


The sections above discussed the early history and the
current situation of robots; robotics also has a great
future.

Nowadays robotics is not used only in industry; it is
spreading into our daily routines and will do so further in the
future. For example, SoftBank’s communication robot “Pepper”
has been developed not only for domestic use but also for
businesses, and can be observed serving customers at electronics
retail stores or in hotels. Another example is “aibo”, a pet
robot developed and sold by Sony, one of the leading Japanese
consumer electronics manufacturers. Many companies are now
developing and producing robots, not only big manufacturers
like Sony but also start-ups.

Many newly developed robots resemble
the shapes of animals and people. Many of these robots
place an emphasis on communicating with people, like
“Pepper” and “aibo” mentioned above. Human-shaped robots
have the capability to behave like humans, and they are also
familiar to human beings. For example, the environment in
which we live today (the width of a passage, the steps of a
staircase, the position of a doorknob, and so on) is built
around the human body. If a robot is the same size and able to
move in the same way as humans, it will be capable of
performing tasks within the living environments of people.
This idea ultimately leads to enabling robots to work on behalf
of humans in dangerous environments.

With the dramatic evolution of the elemental technologies that
constitute robots, such as image, speech, or space recognition,
AI learning, sensors and actuators, preparations are underway
for robots to be actively utilized in a broader field. A future
in which humanoid robots, such as the new one by Kawasaki,
become a common presence in households all over the world and
assist humans in everyday tasks such as cooking, laundry,
and cleaning may not be too far away.


V. CONCLUSION 


Today we find most robots working for people in industries,
factories, warehouses, and laboratories. Robots are useful in
many ways. For instance, they boost the economy, because
businesses need to be efficient to keep up with industry
competition. Having robots helps business owners to
be competitive, because robots can do some jobs better and
faster than humans can; for example, robots can build and
assemble a car. Yet robots cannot perform every job; today
robots’ roles include assisting research and industry. Finally,
as the technology improves, there will be new ways to use
robots, which will bring new hopes and new potential.
Virtualization in Cloud Computing


Anju Davis, Ganga Rajeev 
and Nileena C U 
Vidya Academy of Science & Technology 
Thrissur - 680501, Kerala 


Abstract—Cloud computing is a modern technology that
increases application potential in terms of functioning, elastic
resource management, and collaborative execution.
The central part of cloud computing is virtualization, which
makes industry or academic IT resources available dynamically
through on-demand allocation. This paper focuses on the different
types of virtualization, virtualization techniques and tools, and
future research directions. Virtualization was introduced to allow
large, expensive mainframes to be easily shared among different
application environments. As hardware costs went down, the need
for virtualization declined. More recently, virtualization at all
levels (system, storage, and network) became important again as a
way to improve system security, reliability, and availability,
reduce costs, and provide further advantages. The virtualization
process and its implementation are documented together with their
advantages and the different types of virtualization tools. This
paper consists of two sections. Section 1 provides a brief
explanation of virtualization, virtualization for the cloud, and
the types of virtualization. Section 2 covers virtualization tools
and their comparison.

Index Terms—Cloud computing, virtualization, virtual
machine monitor, hypervisor, emulation, VMware


I. INTRODUCTION 


First, cloud computing isn’t network computing. With network
computing, applications or documents are hosted on a
single company’s server and accessed over the company’s 
network. Cloud computing is a lot bigger than that. It en- 
compasses multiple companies, multiple servers, and multiple 
networks. Plus, unlike network computing, cloud services and 
storage are accessible from anywhere in the world over an 
Internet connection; with network computing, access is over 
the company’s network only. Cloud computing also isn’t tradi- 
tional outsourcing, where a company farms out (subcontracts) 
its computing services to an outside firm. While an outsourcing 
firm might host a company’s data or applications, those 
documents and programs are only accessible to the company’s 
employees via the company’s network, not to the entire world 
via the Internet. So, despite superficial similarities, network
computing and outsourcing are not cloud computing.

Technology is advancing rapidly. With the
expansion of computer systems, the virtual machine has
become a major research topic. Using virtual machine
technology, a computer system can combine all forms of data
resources, software resources, and hardware resources, and
virtualization can make these resources serve diverse tasks.
The technology also separates hardware and software management
and offers valuable features including performance isolation,
server consolidation, and live migration. Additionally,
virtualization can offer portable environments for modern
computer systems. Consequently, virtualization technology has
been widely adopted.


Fig. 1. Virtualisation in cloud computing


II. DEFINITION OF CLOUD COMPUTING 


Cloud computing is the use of various services, such as
software development platforms, servers, storage, and software,
over the Internet, which is often referred to as the cloud.
Cloud computing is a way to access and share a pool of
configurable computing resources over the Internet, such as
networks, servers, storage, applications, and services, with
minimal management effort or service provider interaction.
Cloud computing has many benefits and characteristics:

• The backend of the application (especially hardware) is
completely managed by the cloud provider.
• A user only pays for services used (memory, processing
time and bandwidth).
• Services are scalable.


III. VIRTUALIZATION 


TABLE I
DIFFERENCE BETWEEN CLOUD COMPUTING AND VIRTUALIZATION

Aspect | Virtualization | Cloud
Definition | Technology | Methodology
Purpose | Create multiple simulated environments from 1 physical hardware system | Pool and automate virtual resources for on-demand use
Use | Deliver packaged resources to specific users for a specific purpose | Deliver variable resources to groups of users for a variety of purposes
Configuration | Image-based | Template-based
Lifespan | Years (long-term) | Hours to months (short-term)
Cost | High capital expenditures (CAPEX), low operating expenses (OPEX) | Private cloud: high CAPEX, low OPEX. Public cloud: low CAPEX, high OPEX
Scalability | Scale up | Scale out
Workload | Stateful | Stateless
Tenancy | Single tenant | Multiple tenants


Virtualization is a technique that allows a physical
instance of an application or resource to be shared among
multiple organizations or users. The majority of cloud-based
systems combine their resources into pools that can be assigned
on demand to users, and in cloud computing these pooled
resources are assigned using virtualization. Virtualization
assigns a logical name to a physical resource and then provides
a pointer to that physical resource when a user requests it.
Virtualization provides a means to manage resources
efficiently because the mapping of virtual resources to physical
resources can be both dynamic and effortless. The machine
on which a virtual machine is created is called the host
machine, and the virtual machine is the guest machine. The
virtual machine is managed by software or firmware known as
a hypervisor. These are among the different types of
virtualization that are characteristic of cloud computing:


• Access:
A client can request access to a cloud service from any
location.

• Application:
A cloud has multiple application instances and directs
requests to an instance based on conditions.

• CPU:
Computers can be partitioned into a set of virtual
machines with each machine being assigned a workload.
Alternatively, systems can be virtualized through
load-balancing technologies.

• Storage:
Data is stored across storage devices and often replicated
for redundancy.
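
The idea that virtualization gives a physical resource a logical name and resolves that name on request can be sketched as a small lookup table. This is a minimal illustration rather than how any particular cloud platform works; the class `ResourcePool`, its methods, and the resource names are hypothetical.

```python
class ResourcePool:
    """Maps logical resource names to physical resources (hypothetical model)."""

    def __init__(self, physical_resources):
        self._free = list(physical_resources)   # physical resources not yet assigned
        self._mapping = {}                      # logical name -> physical resource

    def allocate(self, logical_name):
        """Bind a logical name to a free physical resource on demand."""
        if logical_name in self._mapping:
            return self._mapping[logical_name]
        if not self._free:
            raise RuntimeError("resource pool exhausted")
        self._mapping[logical_name] = self._free.pop(0)
        return self._mapping[logical_name]

    def resolve(self, logical_name):
        """Return the physical resource currently behind a logical name."""
        return self._mapping[logical_name]

    def migrate(self, logical_name):
        """Remap a logical name to another physical resource; the name is unchanged."""
        if not self._free:
            raise RuntimeError("no free resource to migrate to")
        old = self._mapping[logical_name]
        self._mapping[logical_name] = self._free.pop(0)
        self._free.append(old)
        return self._mapping[logical_name]


pool = ResourcePool(["host-a/disk0", "host-b/disk0", "host-c/disk0"])
pool.allocate("vol-users")
pool.allocate("vol-logs")
pool.migrate("vol-users")            # the mapping changes, the logical name does not
print(pool.resolve("vol-users"))
```

Because users only ever hold the logical name, the mapping can change underneath them, which is the sense in which the mapping is described as dynamic.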


IV. ROLE OF VIRTUALIZATION IN CLOUD COMPUTING 


Virtualization is the backbone of cloud computing. With the
help of virtualization, cloud computing becomes more efficient
and convenient, and it also offers solutions to major
challenges in the field of data security and privacy
protection. Virtualization is the imitation
of hardware within a software program: a single computer
is allowed to perform the role of multiple computers. If a
web server and a file server run on separate physical machines,
the costs of purchase, maintenance, depreciation, energy, and
floor space are doubled; by creating a virtual web server and a
virtual file server, the same objectives are met with improved
security, maximum use of hardware resources, flexibility, and
reduced cost. The benefits of virtualization include efficient
use of resources, increased security, portability, problem-free
testing, easier manageability, increased flexibility, fault
isolation, and rapid deployment.

Virtualization in cloud computing means that a server or
central computer hosts an application for multiple users,
removing the need to install the software separately on every
machine. The information from different databases, hard
drives, and USB drives is merged into one location, thereby
increasing its accessibility and security. Virtualization in
cloud computing also refers to the creation of virtual hardware,
software, an operating system, or a storage or network device.
In an IT environment, virtual changes occur more rapidly than
physical changes. These changes have to be managed, and
virtualization in cloud computing makes them scalable and
agile.


V. IMPORTANCE OF VIRTUALIZATION 


Virtualization is a necessity in a cloud computing
environment because it makes the maintenance of resources easier.
Virtualization in cloud computing increases security by
protecting the integrity of both the cloud components and the
guest virtual machines. Virtualized machines can
also be scaled up or down on demand and can provide reliability.
High utilization of pooled resources, resource sharing, and
rapid provisioning are further benefits, notably for managed
service providers. Other benefits include:


• Simplified management
• Reduced system administrative work
• Resource optimization
• Cost savings
• Easier software installation
• Data centre consolidation
• Decreased power consumption


VI. TYPES OF VIRTUALIZATION 
A. Data Virtualization 


Data virtualization means that a user can access the same
data from different physical locations. In data virtualization,
the data is moved to a server and mapped to its actual location,
and the user is allowed to access it. This makes it possible to
scroll through the data as if reading a web page, without
bringing it directly onto the user’s computer or another server.


B. Application Virtualization 


Application virtualization is a technique that separates the
application from the operating system. The application is
executed on a server or another system rather than on the system
that is using it. The main benefit of application virtualization
is that a user is able to run otherwise incompatible applications
side by side. Applications that were not made for the operating
system of the computer from which they are accessed can also be
run.

Top application virtualization tools:


• Parallels Remote Application Server
• VMware ThinApp
• Microsoft App-V


C. Network Virtualization 


Network virtualization is a way of combining the usable
resources in a network by dividing the available bandwidth into
channels. Each channel is independent of the others, and each
can be assigned or reassigned to a particular device or server
in real time. The idea is that virtualization masks the actual
complexity of the network by splitting it into manageable
parts, just as partitioning a hard drive makes it easier to
store files.
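
The division of one physical link into independent, reassignable channels can be sketched as follows. This is a toy model under stated assumptions: the class `VirtualizedLink`, the channel names, and the bandwidth figures are hypothetical and not taken from any real networking product.

```python
class VirtualizedLink:
    """A physical link whose bandwidth is divided into independent channels (toy model)."""

    def __init__(self, capacity_mbps):
        self.capacity_mbps = capacity_mbps
        self.channels = {}          # channel name -> (bandwidth in Mbps, assigned device)

    def free_mbps(self):
        """Bandwidth of the physical link not yet given to any channel."""
        return self.capacity_mbps - sum(bw for bw, _ in self.channels.values())

    def create_channel(self, name, mbps, device=None):
        if mbps > self.free_mbps():
            raise ValueError("not enough free bandwidth on the physical link")
        self.channels[name] = (mbps, device)

    def reassign(self, name, device):
        """Hand an existing channel to another device without touching other channels."""
        mbps, _ = self.channels[name]
        self.channels[name] = (mbps, device)


link = VirtualizedLink(capacity_mbps=1000)
link.create_channel("ch1", 400, device="web-server")
link.create_channel("ch2", 300, device="db-server")
link.reassign("ch1", "backup-server")
print(link.channels, link.free_mbps())
```

Reassigning `ch1` leaves `ch2` untouched, which mirrors the claim that each channel is independent of the others.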


D. Server Virtualization 


Server virtualization is the hiding of server resources from
server users. It spares the user from having to understand and
manage the complicated details of server resources, while
increasing resource sharing and utilization and keeping the
capacity to expand later. Pooling the storage of multiple
network storage devices into what appears to be a single
storage device, managed from a central console, is called
storage virtualization. Storage area networks commonly use
storage virtualization.
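
Presenting several storage devices as one can be illustrated by translating a logical block number into a device and a block on that device. The sketch assumes simple concatenation of devices; real storage virtualization layers may stripe or replicate data instead, and the names `StoragePool` and `nas-1` to `nas-3` are hypothetical.

```python
class StoragePool:
    """Presents several devices as one logical block range (simple concatenation)."""

    def __init__(self, devices):
        # devices: list of (device name, number of blocks)
        self.devices = list(devices)

    def total_blocks(self):
        return sum(size for _, size in self.devices)

    def locate(self, logical_block):
        """Translate a logical block number into (device name, block on that device)."""
        if logical_block < 0:
            raise IndexError("negative block number")
        offset = logical_block
        for name, size in self.devices:
            if offset < size:
                return name, offset
            offset -= size
        raise IndexError("logical block outside the pool")


storage = StoragePool([("nas-1", 100), ("nas-2", 250), ("nas-3", 50)])
print(storage.total_blocks())   # 400
print(storage.locate(120))      # ('nas-2', 20)
```

The user of the pool sees a single range of 400 blocks and never needs to know which device holds a given block.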


E. Operating System Virtualization 


Operating system virtualization is also known as
container-based virtualization: a single operating system runs
on the server but is partitioned into isolated virtual
environments. Each virtual environment has its own set of rules
and access rights, with the one restriction that all of them
must be compatible with the same operating system. A simple
example of OS virtualization is OpenVZ.


F. Para-virtualization 


In para-virtualization, operating system virtualization and
hardware virtualization are combined. An operating system
running on the server either goes through the virtualization
software or accesses the hardware directly. This dual
access gives the para-virtualization model more ways to use
the available resources and maximize the operability of the
device. The Xen platform is an example of open-source
para-virtualization.


Fig. 2. Types of virtualization


VII. MAJOR VIRTUALIZATION TECHNIQUES USED IN 
CLOUD COMPUTING 


The following are the main virtualization techniques that 
are currently in use: 


A. Binary Translation and Native Execution 


This technique uses a combination of binary translation
for handling privileged and sensitive instructions and direct
execution for user-level instructions. It is very efficient in
terms of both performance and compatibility with the guest OS.
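
The split between translated and directly executed instructions can be shown with a toy instruction stream. This is only a schematic sketch: real binary translation rewrites machine code, whereas here instructions are tuples, the privileged set (`cli`, `sti`, `hlt`) is a stand-in chosen for illustration, and the virtual CPU is a plain dictionary.

```python
PRIVILEGED = {"cli", "sti", "hlt"}   # toy stand-ins for privileged/sensitive instructions


def translate(block):
    """Rewrite a guest block: privileged instructions become calls into the monitor."""
    return [("emulate", op, arg) if op in PRIVILEGED else ("direct", op, arg)
            for op, arg in block]


def run(translated, vcpu):
    """Execute a translated block against the guest's virtual CPU state."""
    for kind, op, arg in translated:
        if kind == "direct":
            # User-level instruction: executed as-is on the (simulated) processor.
            if op == "add":
                vcpu["acc"] += arg
            elif op == "mul":
                vcpu["acc"] *= arg
            else:
                raise ValueError(f"unknown instruction {op!r}")
        else:
            # Privileged instruction: only the guest's virtual state is changed.
            vcpu["emulated"] += 1
            if op == "cli":
                vcpu["interrupts_enabled"] = False
            elif op == "sti":
                vcpu["interrupts_enabled"] = True
            elif op == "hlt":
                vcpu["halted"] = True
                break
    return vcpu


guest_block = [("add", 2), ("cli", None), ("mul", 5),
               ("sti", None), ("hlt", None), ("add", 100)]
vcpu = {"acc": 1, "interrupts_enabled": True, "halted": False, "emulated": 0}
run(translate(guest_block), vcpu)
print(vcpu)
```

The guest never disables interrupts on the real machine; the monitor only records that the guest believes it did, which is why an unmodified guest OS can run under this scheme.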


B. OS Assisted Virtualization (Paravirtualization) 


In this technique, the guest OS is modified to be
virtualization-aware, which allows it to communicate with the
hypervisor through hypercalls so as to handle privileged and
sensitive instructions.
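
A hypercall interface can be pictured as a table of services that a modified guest kernel calls instead of executing privileged instructions itself. The sketch below is an assumption-laden toy: the hypercall names `set_page_table` and `yield_cpu` and both classes are hypothetical and do not reproduce the interface of Xen or any other hypervisor.

```python
class Hypervisor:
    """Toy hypervisor exposing a small hypercall table (names are hypothetical)."""

    def __init__(self):
        self.page_tables = {}      # guest id -> page-table base set via hypercall
        self.log = []              # record of (guest id, hypercall name)
        self._table = {
            "set_page_table": self._set_page_table,
            "yield_cpu": self._yield_cpu,
        }

    def hypercall(self, guest_id, name, *args):
        if name not in self._table:
            raise ValueError(f"unknown hypercall {name!r}")
        self.log.append((guest_id, name))
        return self._table[name](guest_id, *args)

    def _set_page_table(self, guest_id, base):
        self.page_tables[guest_id] = base
        return 0

    def _yield_cpu(self, guest_id):
        return 0


class ParavirtualGuest:
    """A guest kernel modified to ask the hypervisor instead of touching hardware."""

    def __init__(self, guest_id, hypervisor):
        self.guest_id = guest_id
        self.hv = hypervisor

    def switch_address_space(self, base):
        # A native kernel would execute a privileged instruction here.
        return self.hv.hypercall(self.guest_id, "set_page_table", base)


hv = Hypervisor()
guest = ParavirtualGuest("vm1", hv)
guest.switch_address_space(0x1000)
print(hv.page_tables, hv.log)
```

The cost of this approach is visible in `ParavirtualGuest`: the guest's own source has to change, which is why para-virtualization requires a modified guest OS.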


C. Hardware-assisted Virtualization 


As an alternative approach to binary translation, and in an
attempt to enhance performance and compatibility, hardware
providers (e.g., Intel and AMD) started supporting
virtualization at the hardware level. In hardware-assisted
virtualization (e.g., Intel VT-)
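
On Linux, hardware support for virtualization is advertised through CPU flags: `vmx` for Intel processors and `svm` for AMD processors, both listed in `/proc/cpuinfo`. The following sketch parses text in that format; the sample string is made up for illustration, and the function names are not part of any library.

```python
def virtualization_flags(cpuinfo_text):
    """Return the hardware-virtualization flags found in /proc/cpuinfo-style text."""
    found = set()
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            found |= {"vmx", "svm"} & set(value.split())
    return found


def describe(flags):
    """Turn the detected flags into a short human-readable description."""
    if "vmx" in flags:
        return "Intel VT (vmx flag)"
    if "svm" in flags:
        return "AMD-V (svm flag)"
    return "no hardware virtualization flag found"


# Illustrative sample; on a Linux host the text could come from /proc/cpuinfo.
sample = "processor\t: 0\nflags\t\t: fpu vme de pse msr vmx sse sse2\n"
print(describe(virtualization_flags(sample)))
```

A missing flag does not always mean the processor lacks the feature, since firmware settings can hide it from the operating system.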


VIII. COMPARISON BETWEEN VIRTUALIZATION TYPES 


A. Desktop Virtualization vs Application Virtualisation 


Table II gives a comparison between desktop virtualisation 
and application virtualisation. 


Anju Davis et al., “Virtualization in Cloud Computing” 


Proceedings of Vidya MCA Departmental Seminar (VMCADS - 2021), 22 - 23 November 2021 


Vidya Academy of Science & Technology, Thrissur — 680501 


TABLE II
DESKTOP VIRTUALIZATION VS APPLICATION VIRTUALISATION

Desktop virtualization | Application virtualization
Offers greater flexibility to virtual infrastructure | A lesser level of flexibility in comparison
A richer and consistent desktop experience | The desktop experience differs from application to application
Maintenance of applications is difficult as simple changes require redeployment of the golden image to all VDI instances | Allows easier maintenance of applications, allowing changes to take place without the user even noticing that changes have taken place
Cost can be a concern depending on the use case | Cost-effective solution
Applications are still tied to the underlying OS | Fully isolates the application from the underlying OS
Gives users the experience of a complete desktop | Gives users an experience individualized to make it application-specific
Complete virtualization causes more impact on the underlying hardware | Application virtualization transfers less data, thus lowering the impact on the hardware
Access and authentication management is comparatively difficult | Access and authentication management is comparatively easier


B. Software vs Hardware Virtualization 


• In the case of software virtualization, the host system needs
to completely emulate the guest’s platform (i.e., everything
from the hardware and CPU instructions through the firmware
to the operating system, if there is one). The advantage
is that the host and guest platforms are independent (as in
the example of a Nintendo emulator). The disadvantage is
that this approach is very slow and resource consuming
(since everything has to be emulated).

• Hardware(-assisted) virtualization provides a significant
performance gain over software virtualization by running
some guest code directly on the host hardware with
limited or no assistance from the host system. The hardware,
however, needs to support this (search for Intel VT or
AMD-V). The disadvantage compared with software virtualization
is that the guest and host systems need to use the same
platform (i.e., hardware virtualization cannot be used for
the Nintendo example).
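
The cost of software virtualization comes from interpreting every guest instruction in host code. A minimal sketch of such an interpreter is given below; the four-instruction guest machine (`load`, `add`, `dec`, `jnz`) is invented for illustration and does not correspond to any real console or CPU.

```python
def emulate(program, registers=None):
    """Interpret a toy guest instruction set entirely in software."""
    regs = dict(registers or {})
    pc = 0                                   # guest program counter
    steps = 0                                # number of guest instructions interpreted
    while pc < len(program):
        op, *args = program[pc]
        steps += 1
        if op == "load":                     # load reg, constant
            regs[args[0]] = args[1]
        elif op == "add":                    # add dst, src
            regs[args[0]] += regs[args[1]]
        elif op == "dec":                    # dec reg
            regs[args[0]] -= 1
        elif op == "jnz":                    # jnz reg, target
            if regs[args[0]] != 0:
                pc = args[1]
                continue
        else:
            raise ValueError(f"unknown opcode {op!r}")
        pc += 1
    return regs, steps


program = [
    ("load", "a", 0),
    ("load", "b", 5),
    ("load", "n", 3),
    ("add", "a", "b"),     # loop body: a += b
    ("dec", "n"),
    ("jnz", "n", 3),       # repeat while n != 0
]
regs, steps = emulate(program)
print(regs, steps)
```

Every guest instruction costs several host operations (fetch, decode, dispatch), which is the overhead that hardware-assisted virtualization avoids by running guest code directly on the processor.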


C. Desktop Virtualization vs Server Virtualization 


• Server virtualization does not add any additional load to
the network; desktop virtualization operates entirely over
the network, which can slow down production speeds.
Desktop virtualization requires a company to make more
changes to its IT resources: properly enabling desktop
virtualization affects the data centre, the network, and the
transmission protocol. Server virtualization only requires
changes to be made to the server.

• Both desktop virtualization and server virtualization can
help cut costs while making data easily available to
employees. A company considering desktop virtualization
or server virtualization must fully understand
the difference between the two. For a smooth transition,
a company must plan its move to desktop
virtualization, server virtualization, or both.


IX. DIFFERENT KINDS OF SERVER VIRTUALIZATION 
A. Full Virtualization 


Full virtualization uses a hypervisor, a type of software that 
directly communicates with a physical server’s disk space and 
CPU. The hypervisor monitors the physical server’s resources 
and keeps each virtual server independent and unaware of the 
other virtual servers. It also relays resources from the physical 
server to the correct virtual server as it runs applications. 
The biggest limitation of using full virtualization is that a 
hypervisor has its own processing needs. This can slow down 
applications and impact server performance. 


B. Para-Virtualization 


Unlike full virtualization, para-virtualization involves the
entire system working together as a cohesive unit. Since each
guest operating system on the virtual servers is aware of the
demands that the other operating systems place on the physical
server, the hypervisor does not need to use as much processing
power to manage the operating systems.


C. OS-Level Virtualization 


Unlike full and para-virtualization, OS-level virtualization
does not use a hypervisor. Instead, the virtualization
capability, which is part of the physical server’s operating
system, performs all the tasks of a hypervisor. However, with
this method all the virtual servers must run that same
operating system.

TABLE III
COMPARING SERVER VIRTUALIZATION ARCHITECTURE
Host-based server virtualization

| | Full | Para | Hardware assisted | OS virtualization |
| Common role | Legacy server, training | Production servers that run on paravirtualized OS | Production servers | High performance web and database servers that require full isolation and high consolidation ratios |
| Limitations | Reduced performance due to higher virtualization overhead | Limited OS | Requires server hardware that supports Intel VT | No support for legacy operating systems |
| Isolation | Each VM runs its own OS | Provides the same isolation as full virtualization | Improved over full and para virtualization | Isolation achieved by running each VE as an application on a shared OS |
| Performance | No noticeable degradation in CPU | Good on CPU degradation | Better; no CPU degradation | Best on CPU, network, disk overhead |
| Management | Point-level tools available from each vendor for management and monitoring of VMs and physical host | Same as full virtualization | Same as full virtualization | Few integration options available for enterprise management tools due to low [...] |
| Patching | Distributed enterprise patch management software required to simplify management | Same as full virtualization | Same as full virtualization | Centralized patching for the host and all virtual environments |


X. HOW SERVER VIRTUALIZATION WORKS? 


In server virtualization, each virtual server is typically
dedicated to a particular task, which improves performance.
Every virtual server behaves like a distinct physical device
that is capable of running its own operating system. Software
specially designed for this purpose is used: through it, an
administrator can convert one physical server into multiple
virtual servers, which together can make use of all of the
machine’s processing power. Because the CPU of a modern
computer works with multiple processors, it can run many
complicated tasks with ease. This is the basic working
principle of server virtualization.

Why server virtualization? Server virtualization is a
cost-effective method that uses resources efficiently and
provides web hosting services by effectively utilizing the
existing resources of the IT infrastructure. Without server
virtualization, many servers use only a small part of their
overall capacity. Dividing one physical server into multiple
virtual servers, each of which acts like a physical server,
therefore increases the utilization of each physical machine
and reduces the major cost of hardware. Server virtualization
in cloud computing divides the workload among multiple
virtual servers, each of which is capable of performing a
particular task. The workload can be redistributed among the
virtual machines according to the load. All of this is done
either by specially designed software or by an administrator
who converts a single physical server into virtual machines.
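
The idea of converting one physical server into several virtual servers can be sketched as simple resource bookkeeping. The example below is an illustration only: the class `PhysicalServer`, its methods and the numbers are assumptions made for this sketch, and real virtualization software additionally supports features such as overcommitting CPU and memory, which are ignored here.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class PhysicalServer:
    """A toy model of a host that is partitioned into virtual servers."""
    cpu_cores: int
    ram_gb: int
    # Maps a VM name to the (cores, ram_gb) reserved for it.
    vms: Dict[str, Tuple[int, int]] = field(default_factory=dict)

    def free_cores(self) -> int:
        return self.cpu_cores - sum(cores for cores, _ in self.vms.values())

    def free_ram_gb(self) -> int:
        return self.ram_gb - sum(ram for _, ram in self.vms.values())

    def create_vm(self, name: str, cores: int, ram_gb: int) -> bool:
        """Reserve resources for a new VM; refuse if the host cannot fit it."""
        if name in self.vms:
            return False
        if cores > self.free_cores() or ram_gb > self.free_ram_gb():
            return False
        self.vms[name] = (cores, ram_gb)
        return True


host = PhysicalServer(cpu_cores=16, ram_gb=64)
host.create_vm("web", cores=4, ram_gb=16)   # dedicated to web hosting
host.create_vm("db", cores=8, ram_gb=32)    # dedicated to the database
print(host.free_cores(), host.free_ram_gb())
```

Each virtual server in this model receives a fixed share of the host, which mirrors the statement that every virtual server behaves like a distinct physical device.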


A. Server Virtualization Software Features and Capabilities 


Server virtualization software provides these common
features:


• Type 1 or Type 2 hypervisor

• Running multiple virtual machines with different operating
systems on the same server

• Automated virtual machine provisioning

• Management of remote physical locations and branch
locations with rapid provisioning


B. Benefits of Server Virtualization 


1) Server consolidation 
Because virtualization enables one physical server to do 
the work of several servers, the total number of servers in 
the enterprise can be reduced. It’s a process called server 
consolidation. For example, suppose there are currently 
12 physical servers, each running a single application. 
With the introduction of virtualization, each physical 
server might host three VMs, with each VM running an 
application. Then, the organization would only require 
four physical servers to run the same 12 workloads. 

2) Simplified physical infrastructure 
With fewer servers, the number of racks and cables in 
the data center is dramatically reduced. This simplifies 
deployments and troubleshooting. The organization can 
accomplish the same computing goals with just a fraction 
of the space, power and cooling required for the physical 
server complement. 


Anju Davis et al., “Virtualization in Cloud Computing” 


Proceedings of Vidya MCA Departmental Seminar (VMCADS - 2021), 22 - 23 November 2021 


Vidya Academy of Science & Technology, Thrissur — 680501 


3) Reduced hardware and facilities costs 
Server consolidation lowers the cost of data center hard- 
ware as well as facilities — remember, less power and 
cooling. Server consolidation through virtualization is a 
significant cost-saving tactic for organizations with large 
server counts. 

4) Greater server versatility 
Because every VM exists as its own independent instance, 
every VM must run an independent OS. However, the 
OS can vary between VMs, enabling the organization to 
deploy any desired mix of Windows, Linux and other 
OSs on the same physical hardware. Such flexibility is 
unmatched in traditional physical server deployments. 
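
The arithmetic behind the server consolidation example in item 1 can be written as a short calculation. This is a minimal sketch; the function name `hosts_needed` is chosen for illustration, and the calculation assumes that every workload runs in exactly one VM and that every host can carry the same number of VMs.

```python
import math


def hosts_needed(workloads: int, vms_per_host: int) -> int:
    """Number of physical servers required after consolidation."""
    if workloads < 0 or vms_per_host <= 0:
        raise ValueError("workloads must be >= 0 and vms_per_host must be > 0")
    return math.ceil(workloads / vms_per_host)


before = 12                      # one application per physical server
after = hosts_needed(12, 3)      # three VMs per physical server
print(after, before - after)     # prints: 4 8
```

With 12 workloads and three VMs per host, four physical servers suffice, so eight servers can be retired.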


C. Server Virtualization Disadvantages 


1) If a physical server goes offline, the websites hosted on it
also go down. To avoid this, the company must
set up a cluster of servers.

2) Virtual machines must be managed carefully, by
configuring and monitoring them as necessary.

3) Methods for measuring the performance of virtualized
environments are not yet well established.

4) More RAM is required, since each virtual machine
occupies its own separate area of memory. The requirement
for disk space also increases because of the files in
each virtual machine.


XI. VIRTUAL MACHINE TYPES 


A hypervisor, also known as a Virtual Machine Monitor (VMM),
is a low-level program that enables a virtual machine
to access the physical machine. There are two types of
hypervisor, namely the Type 1 hypervisor and the Type 2
hypervisor.


A. Type 1 Hypervisor 


This is also known as Bare Metal or Embedded or Native 
Hypervisor. It works directly on the hardware of the host and 
can monitor operating systems that run above the hypervisor. 
It is completely independent from the Operating System. The 
hypervisor is small as its main task is sharing and managing 
hardware resources between different operating systems. A 
major advantage is that any problems in one virtual machine or 
guest operating system do not affect the other guest operating 
systems running on the hypervisor. 


[Figure: guest operating systems running on top of the hypervisor, which runs directly on the system hardware]


Fig. 3. Type 1 Hypervisor


B. Type 2 Hypervisor 


This is also known as a hosted hypervisor. In this case, the
hypervisor is installed on an operating system, and guest
operating systems can be installed on the hypervisor. The
hypervisor is completely dependent on the host operating
system for its operations. While having a base (host)
operating system allows better specification of policies, any
problems in the host operating system affect the entire
system, even if the hypervisor running above it is secure.


[Figure: three guest operating systems above the host operating system, which runs on the system hardware]


Fig. 4. Type 2 Hypervisor


XII. VIRTUAL MACHINE MONITOR 


A virtual machine monitor is a software layer that monitors
and virtualizes the resources of a host machine according to
the user requirements. It is an intermediate layer between
the operating system and the hardware. Hypervisors are
classified as native or hosted: a native hypervisor runs
directly on the hardware, whereas a hosted hypervisor runs on
the host operating system. The software layer creates virtual
resources such as CPU, memory, storage and drivers.


A. Emulation 


Emulation is a virtualization technique that reproduces the
behaviour of computer hardware in a software program. The
emulator sits in the operating system layer, which in turn
runs on the hardware. Emulation gives the guest operating
system enormous flexibility, but the translation process is
slow compared with a hypervisor, and considerable hardware
resources are required to run the software.
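
Why emulation is slow can be seen from a toy example. The sketch below interprets a made-up guest instruction set (`LOAD`, `ADD`, `DEC`, `JNZ`) in software; the instruction set, the function `run_guest` and the sample program are invented for illustration and do not correspond to any real emulator. Every guest instruction costs one full fetch, decode and dispatch cycle in the host program, whereas hardware-assisted virtualization would run such code directly on the CPU.

```python
def run_guest(program, registers=None):
    """Interpret a list of guest instructions and return (registers, steps)."""
    regs = dict(registers or {})
    pc = 0          # guest program counter
    steps = 0       # number of emulated instructions
    while pc < len(program):
        op, *args = program[pc]      # fetch and decode
        steps += 1
        if op == "LOAD":             # LOAD reg, constant
            regs[args[0]] = args[1]
        elif op == "ADD":            # ADD dst, src  (dst = dst + src)
            regs[args[0]] = regs[args[0]] + regs[args[1]]
        elif op == "DEC":            # DEC reg
            regs[args[0]] -= 1
        elif op == "JNZ":            # JNZ reg, target (jump if reg != 0)
            if regs[args[0]] != 0:
                pc = args[1]
                continue
        else:
            raise ValueError(f"unknown guest instruction: {op}")
        pc += 1
    return regs, steps


# Guest program: add b to a three times, i.e. compute 3 * 5.
program = [
    ("LOAD", "a", 0),
    ("LOAD", "b", 5),
    ("LOAD", "n", 3),
    ("ADD", "a", "b"),    # index 3: start of the loop
    ("DEC", "n"),
    ("JNZ", "n", 3),
]
regs, steps = run_guest(program)
print(regs["a"], steps)   # prints: 15 12
```

The guest computes 3 × 5 in 12 emulated instructions, and each of them requires many host operations, which is the source of the overhead described above.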


B. VMware vSphere 


VMware vSphere is a management infrastructure framework
that virtualizes system, storage, and networking hardware to
create cloud computing infrastructures. vSphere is the
branding for a set of management tools and a set of products
previously labelled VMware Infrastructure. vSphere provides
a set of services that applications can use to access cloud
resources. Some of these services are:


• VMware vCompute
A service that aggregates servers into an assignable pool

• VMware vStorage
A service that aggregates storage resources into an
assignable pool

• VMware vNetwork
A service that creates and manages virtual network
interfaces

• Application services
Such as HA (High Availability) and Fault Tolerance

• vCenter Server
A provisioning, management, and monitoring console for
VMware cloud infrastructure


Fig. 5. VMware and vSphere


A vSphere cloud is purely an IaaS offering. The virtualization
layer that abstracts processing, memory, and storage uses the
VMware ESX or ESXi virtualization server. ESX is a Type 1
hypervisor; it installs over bare metal (a clean system), using
a Linux kernel to boot and to install the vmkernel hypervisor
(the virtualization kernel and support files). When the system
is rebooted, the vmkernel loads first, and the Linux kernel
then becomes the first guest operating system to run as a
virtual machine on the system; it contains the service console.
VMware is a very highly developed infrastructure and the
current leader in this industry. Many add-on products are
available for cloud computing applications. Some of these
products are:

1) Virtual Machine File System (VMFS)
A high-performance cluster file system for an
ESX/ESXi cluster

2) Distributed Resource Scheduler (DRS)
A system for provisioning virtual machines and load
balancing processing resources dynamically across the
different physical systems that are in use. A part of the
DRS called the Distributed Power Management (DPM)
module can manage the power consumption of systems.

3) vNetwork Distributed Switch (DVS)
A capability to maintain a network runtime state for
virtual machines as they are migrated from one physical
system to another. DVS also monitors network connections,
provides firewall services, and enables the use of
third-party switches to manage virtual networks.
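
The kind of load balancing attributed to DRS above can be illustrated with a simple greedy placement. This sketch is not VMware's algorithm, whose details are not given in this paper; it only shows one plausible strategy, namely placing each VM on the host with the lowest current utilisation that still has room. The function `place_vms`, the host names and the load figures are all hypothetical.

```python
from typing import Dict


def place_vms(vm_loads: Dict[str, float],
              host_capacity: Dict[str, float]) -> Dict[str, str]:
    """Greedily assign each VM to the least-utilised host that can fit it."""
    used = {host: 0.0 for host in host_capacity}
    placement: Dict[str, str] = {}
    # Place the most demanding VMs first so that they are not left stranded.
    for vm, load in sorted(vm_loads.items(), key=lambda kv: kv[1], reverse=True):
        candidates = [h for h in host_capacity
                      if used[h] + load <= host_capacity[h]]
        if not candidates:
            raise RuntimeError(f"no host has enough capacity for {vm}")
        target = min(candidates, key=lambda h: used[h] / host_capacity[h])
        used[target] += load
        placement[vm] = target
    return placement


placement = place_vms(
    {"vm1": 4, "vm2": 2, "vm3": 2, "vm4": 1},   # CPU demand per VM
    {"hostA": 8, "hostB": 8},                   # CPU capacity per host
)
print(placement)
```

A production scheduler would also react to changing load by migrating running VMs, which this static sketch does not attempt.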

Physical computers can be standalone hosts or a set of
clustered systems. A set of virtual machines can be created
that is part of a single physical system or spans two or more
physical systems. A group of VMs can be defined as a Resource
Pool (RP) and managed as a single object with a single policy.
As hosts or cluster nodes are added or removed, vSphere can
dynamically adjust the provisioning of VMs to accommodate the
policy in place. This fine tuning of pooled resources is
required to accommodate the needs of cloud computing networks.
The datastore shown at the center of the figure is a shared
storage resource. These storage resources can be either Direct
Attached Storage (DAS) of a server or Network Attached Storage
(NAS) disk arrays.
The key features of virtual infrastructure are:
• Flexibility in implementation
• Creating a virtual machine is a very fast process, typically
taking only a few seconds.
• Machine images or snapshots of a virtual machine can be
taken. These images can be brought online as needed.


XIII. VIRTUALIZATION TOOLS 


For a comparison of virtualisation tools, see Table IV 


1) Virtual Network User Mode Linux (VNUML)
VNUML is open source and is available to all users as a free
download. It is a virtualization tool used to run multiple
virtual Linux systems. These virtual systems, known as guests,
run their applications alongside the Linux operating system of
the original system, which is referred to as the host.

2) VirtualBox
VirtualBox is used to implement virtual machines on physical
computers and servers. It performs full virtualization on the
host computer, which means that the guest operating system is
executed on the host computer without any modification.

3) VMware Server
VMware Server is a free virtualization tool for Linux as well
as Windows operating systems. It is based on full
virtualization, i.e. it allows a physical desktop computer to
run more than one virtual machine, called guests, with varying
operating systems.

4) EMF Tool
The EMF virtualization tool is an Eclipse-based plug-in built
on EMF that supports the transparent use of virtual models, all
of which are based on EMF. To create a virtual model with the
EMF tool, users have to provide the contributing models along
with their metamodels. The following three elements are the
basis of any virtual model formed by the EMF tool.


• Composition Metamodel: It is used for the specification
of virtual model concepts. The user may define
it, or it can be the amalgamation of various separate
composition processes.

• Correspondence Model: It is mostly defined with the
AMW tool. This correspondence model contains all the
virtual links that relate the contributing elements and
identifies the manner in which they are to be composed.

• Virtual Model: It is a file that specifies the physical
location of all hardware resources that are to be
used in the virtual composition process.


5) Virtual EMF
Virtual EMF is a virtual model composition tool. Its
distinguishing feature is that it allows overcoming the
limitations of virtual models, for example that virtual
models are unable to hold concrete data although they are
easily accessed. It helps in manipulating the original data
contained in other EMF models. This tool is also built on
Eclipse/EMF.


TABLE IV
COMPARISON OF VIRTUALIZATION TOOLS

| Virtualization tool | Availability | Purpose | Mode of virtualization |
| VMware | Commercial | Gives a better product for managing virtual infrastructure | Full virtualization |
| Xen | Open source | For virtual machine (VM) migration | Paravirtualization |
| QEMU | Open source | For a heterogeneous range of hardware architectures; may be used as an emulator | Native virtualization |
| VirtualBox | Commercial | Commercial version supports the remote desktop protocol | Native virtualization |
| VMware Workstation | Open source | Runs under an open source operating system (OS) | Full virtualization |
| VMware vCenter Converter | Open source | Runs under an open source operating system (OS) | Full virtualization |
| VMware Server | Currently free (not open source) | Runs on both Windows and Linux platforms | Full virtualization |
| KVM (Kernel-based Virtual Machine) | Open source | For Linux servers with CPU support for virtualization | Full virtualization |


XIV. CONCLUSION


Cloud computing is a newly developing paradigm of
distributed computing. Virtualization, in combination with the
utility computing model, can make a difference in the IT
industry as well as from a social perspective. Virtualization
essentially means running multiple operating systems on a
single machine while sharing all the hardware resources. It
helps to provide a pool of IT resources that can be shared for
business benefit. Virtualization tools are a wide topic of
research, since they are used in the virtualization of
hardware machines. In this paper, we discussed various types
of virtualization, their benefits, and the tools and
techniques used in virtualization, along with the features of
virtualization tools. Virtualization has a variety of
applications, a few of which were discussed in this paper.
Abstract—With the emergence of the global economy, and with
an ever-increasing percentage of consumers doing their business
primarily online or via mobile devices, electronic commerce
(e-commerce) is fast being regarded as the way to go global at
the touch of a button. Hence, developing an effective e-commerce
model is becoming vital for any modern business. However, a
company must address new security challenges and be
certain to maintain the highest standards of e-commerce security,
to protect both itself and its customers. A failure to
adhere to stringent e-commerce security can result in lost data,
compromised transaction information, and the release of
customers’ financial data. This can lead to legal and financial
liability, as well as a negative impact on the company’s reputation.
These new security challenges result from the use of
new technology and communication media, and from the flow of
information from enterprise to enterprise, from enterprise to
consumers, and within the enterprise. This paper presents
the different technological and conceptual components of
e-commerce in general, and identifies and classifies the different
types of security challenges facing e-commerce businesses in
particular.

Index Terms—E-commerce security issues, challenges, risks,
threats, security measures, secure online shopping guidelines,
digital e-commerce.


I. INTRODUCTION 


E-COMMERCE refers to a wide range of online business
activities for products and services, for which security is
the basic need in order to protect information sent over the
internet. The internet has become a comprehensive platform for
any type of business and commercial transaction through
e-commerce websites. It enables businesses of different kinds
to trade goods and services with one another. These services
include ‘click and buy’ methods using computers as well
as various mobile devices or smartphones. E-commerce, or
electronic commerce, is a business model that lets firms and
individuals buy and sell things over the internet.
E-commerce operates in the following four major
market segments:


1) B2B - Business to Business
2) B2C - Business to Consumers
3) C2C - Consumers to Consumers
4) C2B - Consumers to Business


This paper portrays different security risks and challenges
in the e-commerce sector and also prescribes various solutions
to reduce and eliminate the risks of security issues in
e-commerce.

E-commerce security is essential for success in this
industry. A serious business should, therefore, employ
rock-solid e-commerce security protocols and measures, which
keep the business and its customers free from attacks.
E-commerce security is the set of guidelines that ensure safe
transactions over the internet. It consists of protocols that
safeguard people who engage in the online selling and buying
of goods and services. A business needs to gain its customers’
trust by putting e-commerce security basics in place. While
growth in e-commerce has improved online transactions, it has
attracted the attention of bad actors in equal measure.
E-commerce security is a part of the information security
framework that applies to web-based business, incorporating
computer security, data security and other domains of the
information security system.

Today, protection, privacy and security are among the major
concerns for electronic innovations. E-commerce also shares
security concerns with other advancements in the
field. Rules for securing frameworks and systems are
available for the staff of internet businesses to read
and implement. Educating the shopper on security issues is still
in the early stages, but it will end up being the most significant
component of e-commerce security planning. Security is
an essential part of any transaction that takes place over the
internet.


Marjan Haseena T M et al., “E-Commerce: Critical Risk Factors and Key Factors for Success”

Customers will lose faith in an e-business if its security
is compromised. The essential requirements for safe
e-commerce transactions are listed in Section II as the eight
dimensions of e-commerce security. E-commerce security is a part
of the information security framework and is specifically
applied to the components that affect e-commerce, which include
computer security, data security and other wider realms of
the information security framework. E-commerce security has
its own particular nuances and is one of the most visible
security components, affecting end users through their daily
payment interactions with businesses. E-commerce security is the
protection of e-commerce assets from unauthorized access,
use, alteration, or destruction.

E-commerce offers the banking industry great opportunities,
but it also creates a set of new risks and vulnerabilities, such
as security threats. Today, privacy and security are a major
concern for electronic technologies.

M-commerce shares security concerns with other technologies
in the field. Privacy concerns have been found to reveal
a lack of trust in a variety of contexts, including commerce,
electronic health records, e-recruitment technology and social
networking, and this has directly influenced users. Security
is one of the principal and continuing concerns that restrict
customers and organizations from engaging with e-commerce. This
paper discusses the following topics: an overview of e-commerce
security, understanding online shopping, steps to place an
order, the purpose of security in e-commerce, different security
issues in e-commerce, and secure e-commerce guidelines.


II. OVERVIEW OF E-COMMERCE SECURITY 


E-commerce security is a part of the information security
framework and is specifically applied to the components that
affect e-commerce, which include computer security, data security
and other wider realms of the information security framework.
E-commerce security has its own particular nuances and is
one of the most visible security components, affecting
end users through their daily payment interactions with
businesses. Today, privacy and security are a major concern for
electronic technologies.

M-commerce shares security concerns with other technologies
in the field. Privacy concerns have been found to reveal
a lack of trust in a variety of contexts, including commerce,
electronic health records, e-recruitment technology and
social networking, and this has directly influenced users.
Security is one of the principal and continuing concerns that
restrict customers and organizations from engaging with
e-commerce. E-commerce security is the protection of e-commerce
assets from unauthorized access, use, alteration, or destruction.


Eight dimensions of E-Commerce security: 


• Integrity: prevention of unauthorized data modification.

• Non-repudiation: prevention of any one party
reneging on an agreement after the fact.

• Authenticity: authentication of the data source.

• Confidentiality: protection against unauthorized data
disclosure.

• Privacy: provision of data control and disclosure.

• Availability: prevention of data delays or removal.

• Encryption: information should be encrypted and
decrypted only by authorized users.

• Auditability: data should be recorded in such a way that
it can be audited for integrity requirements.
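
Two of the dimensions above, integrity and authenticity, can be illustrated with a keyed message authentication code. The sketch below uses the HMAC-SHA256 construction from the Python standard library; the key, the order format and the function names `sign` and `verify` are assumptions made for this example. Note that an HMAC relies on a shared secret key, so it does not by itself provide non-repudiation or confidentiality.

```python
import hashlib
import hmac


def sign(message: bytes, key: bytes) -> str:
    """Compute an HMAC-SHA256 tag for the message."""
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def verify(message: bytes, key: bytes, tag: str) -> bool:
    """Check the tag using a constant-time comparison."""
    return hmac.compare_digest(sign(message, key), tag)


key = b"demo-shared-secret"          # hypothetical key shared by shop and gateway
order = b"order_id=1001;amount=49.90;currency=INR"
tag = sign(order, key)

tampered = b"order_id=1001;amount=0.01;currency=INR"
print(verify(order, key, tag))       # True: the order is unmodified
print(verify(tampered, key, tag))    # False: the modification is detected
```

A receiver that holds the same key can therefore detect any unauthorized modification of the order data in transit.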


III. CHALLENGES, SECURITY RISKS, SECURITY ISSUES 
AND SECURITY THREATS IN E-COMMERCE. 


A. Challenges 


The advent of e-commerce was touted as an incomparable
alternative to traditional business premises, one that had only
upsides and no downsides. But business organizations soon
realized that, like every solution, it has two sides. E-commerce
has its own set of challenges, issues, and risks. Some of
the most serious challenges are the following:


• Security problems in client/home computers, where data
stored in web cookies can be stolen and cracked by
hostile websites, or where mail-borne viruses can steal the
user’s financial data from the local disk.

• Eavesdropping and data stealing due to ineffective
encryption or lack of encryption in home wireless networks.

• Malicious hackers can compromise wireless connections
even without exploiting cooperative ad hoc networks at
the transport level.

• Eavesdropping and data stealing from the user’s
keystrokes at Point-of-Sale (POS) terminals in
brick-and-mortar stores.

• Risk of loss or theft.

• Eavesdropping and data stealing from the user’s mobile
and handheld devices.

• In addition to the issue of security, m-commerce
applications introduce new and significant privacy risks to end
users.

• Eavesdropping and data stealing from networks and
different intermediate communication links.

• The wireless medium also provides excellent cover for
malicious users.

• In addition to contending with the usual Internet security
threats in online applications, wireless devices introduce
new hazards specific to their mobility and communication
medium.

• Firewall problems.

• Denial of Service attacks.
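
The first challenge in the list, theft of data stored in web cookies, is commonly reduced by setting restrictive cookie attributes. The following is a minimal sketch using the `http.cookies` module of the Python standard library (Python 3.8 or later for the `samesite` attribute); the cookie name `session` and the function `build_session_cookie` are illustrative assumptions, and the attributes limit exposure rather than eliminate it.

```python
from http.cookies import SimpleCookie


def build_session_cookie(session_id: str) -> str:
    """Return the value of a Set-Cookie header for a hardened session cookie."""
    cookie = SimpleCookie()
    cookie["session"] = session_id
    cookie["session"]["secure"] = True        # send only over HTTPS
    cookie["session"]["httponly"] = True      # hide from page scripts
    cookie["session"]["samesite"] = "Strict"  # do not send on cross-site requests
    cookie["session"]["path"] = "/"
    return cookie["session"].OutputString()


header_value = build_session_cookie("abc123")
print(header_value)
```

The `Secure` attribute addresses eavesdropping on unencrypted links, while `HttpOnly` makes it harder for a hostile script running in the page to read the cookie.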


B. Risks 


In e-commerce, risk is the potential for loss, damage or
destruction of assets or data. A threat is a negative event, such
as the exploitation of a vulnerability. A vulnerability is a
weakness that exposes a system to threats and therefore increases
the likelihood of a negative event.


• Bugs or misconfiguration problems in the web server
that can lead to the theft of confidential documents.

• Risks on the browser side, i.e. breach of the user’s privacy,
damage to the user’s system, browser crashes, etc.

• Interception of data sent from browser to server or vice
versa.

• Risks in software security in e-commerce.

• Security risks of wireless devices must be carefully
analyzed and addressed.

• WAP gap problem.

• Low-level languages.

• Application developers may forgo security features such as
encryption.

• Interesting software development.

• WML script is used to overcome software application
risks.


C. Threats and Issues 


E-commerce security is the protection of e-commerce assets
from unauthorized access, use, alteration, or destruction. While
security features do not guarantee a secure system, they are
necessary to build a secure system. The speedy growth of the
Internet has driven the e-commerce explosion. In the meantime,
however, web-based organizations have faced costly
security issues.


1) Three types of security threats:
• Denial of service
• Unauthorized access
• Theft and fraud
2) Two primary types of DoS attacks:
• Spamming
• Viruses
3) Other types of attacks:

• Worms

• Trojan horses

• Illegal access to systems, applications or data

• Passive unauthorized access

• Active unauthorized access

• Changing the intent of messages, e.g. to abort or delay a
negotiation on a contract

• Masquerading or spoofing

• Sniffers

IV. PURPOSE OF SECURITY IN E-COMMERCE. 


E-commerce security is a part of the information security
framework and is specifically applied to the components
that affect e-commerce, which include computer security, data
security and other wider realms of the information security
framework. E-commerce security has its own particular
nuances and is one of the most visible security components,
affecting end users through their daily payment interactions
with businesses. Today, privacy and security are a major concern
for electronic technologies. Web e-commerce applications that
handle payments (online banking, electronic transactions, or
the use of debit cards, credit cards, PayPal or other tokens) have
more compliance issues, are at greater risk of being
targeted than other websites, and face greater consequences
if data is lost or altered.


V. SECURE ONLINE SHOPPING GUIDELINES 


1) Shop at secure web sites:

How can we check whether a website is secure?
Secure websites use encryption to transfer
data from your computer to the online
merchant’s website. Here is how you can
tell when you are dealing with a secure website:

• If you look at the address bar of your browser, where
the website address is shown, you should
see https://. The “s” displayed after “http”
indicates that the connection to the website is encrypted.
Often, you do not see the “s” until you actually
move to the order page of the website.

• Another way to determine whether a website is
secure is to look for a closed padlock displayed in the
address bar. If that lock is open,
you should assume it is not a secure website.

2) Research the Web Site before You Order:

Do business with organizations you already know. If
the organization is unfamiliar, do your homework before
buying its products. If you decide to buy something
from an unknown organization, start with an inexpensive
order to learn whether the organization is trustworthy. Reliable
organizations should advertise their physical business
address and at least one telephone number, either for customer
service or for a helpline.

3) Read the Web Site’s Privacy and Security Policies:
Every reputable online website provides information about
how it processes your order. It is usually listed
in the section entitled “Privacy Policy.” You can find out
whether the merchant intends to share your information with a
third party or affiliate organizations. Do they require
these organizations to refrain from marketing to their
customers? If not, you can expect to receive spam and
even mail or telephone solicitations from these organizations.

4) Know about Cookies and Behavioural Marketing:
Online merchants as well as other websites keep
track of our shopping and surfing behaviour by using
“cookies,” an online tracking system that attaches pieces
of code to our web browsers to track which
websites we visit. “Persistent” cookies
remain stored on your computer, while “session” cookies expire
when you close the web browser. Online merchants use
cookies to recognize you and speed up the shopping
process the next time you visit. You can
set your web browser to disable or reject
cookies, but the trade-off may limit the functions you
can perform online, and may even prevent you from
ordering online. Generally, you will need to enable
session cookies to place an order.
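
The difference between the two cookie types can be illustrated
with a minimal Python sketch using the standard-library
http.cookies module. The cookie names and values (cart_id,
remember_me) and the helper is_persistent are hypothetical and
chosen only for illustration; the sketch assumes that a cookie
without an Expires or Max-Age attribute is treated as a session
cookie by the browser.

```python
from http.cookies import SimpleCookie


def is_persistent(morsel):
    """A cookie outlives the browser session only if it carries
    an Expires or Max-Age attribute; otherwise it is a session cookie."""
    return bool(morsel["expires"] or morsel["max-age"])


cookies = SimpleCookie()

# Hypothetical session cookie: no lifetime attribute is set.
cookies["cart_id"] = "abc123"

# Hypothetical persistent cookie: kept for 30 days.
cookies["remember_me"] = "yes"
cookies["remember_me"]["max-age"] = 30 * 24 * 3600

for name, morsel in cookies.items():
    kind = "persistent" if is_persistent(morsel) else "session"
    print(f"{name}: {kind}")
```

A browser that is set to reject persistent cookies would still
accept the first cookie above, which is why session cookies are
usually sufficient to place an order.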


VI. E-COMMERCE SECURITY TOOLS AND MEASURES 


Different policies and tools are used to ensure and measure
security in an e-commerce environment. Some of them are
explained below:


1) Privacy Policy:
Develop a privacy and security policy that includes defining
the sensitivity of information, the exposure of the
organization if that information were compromised, and the
likelihood of those risks becoming reality.


2) Cryptography:
• Secret-key cipher system
• Public-key cipher system


3) Certificate:
A certificate contains information such as:
• Certificate holder’s name and identifier
• Certificate holder’s public key information
• Key usage limitation definition
• Certificate policy information
• Certificate issuer’s name and ID


4) Pretty Good Privacy (PGP):
PGP provides a confidentiality and authentication service
that can be used for electronic mail and file storage
applications.
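
To make the public-key idea from item 2 concrete, the following
is a toy sketch of textbook RSA in Python. In a secret-key system
both parties share a single key, whereas here the encryption key
(e, n) can be published and only the holder of the private
exponent d can decrypt. The tiny primes, the message value and
the function names are assumptions chosen for illustration; real
systems use very large keys and padding, so this sketch must not
be used for actual security.

```python
# Toy RSA with tiny textbook primes: for illustration only.
p, q = 61, 53
n = p * q                       # public modulus
phi = (p - 1) * (q - 1)         # Euler's totient of n
e = 17                          # public exponent, coprime with phi
d = pow(e, -1, phi)             # private exponent (needs Python 3.8+)


def encrypt(message, public_key):
    """Encrypt an integer smaller than the modulus with the public key."""
    exponent, modulus = public_key
    return pow(message, exponent, modulus)


def decrypt(ciphertext, private_key):
    """Recover the integer with the private key."""
    exponent, modulus = private_key
    return pow(ciphertext, exponent, modulus)


message = 65
ciphertext = encrypt(message, (e, n))
recovered = decrypt(ciphertext, (d, n))
print(ciphertext, recovered)
```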


VII. SECURITY IN ONLINE BANKING 


In this study of banking management, security measures keep
all banking functions and activities safe. Opening an
account, checking the balance, carrying out a transaction and
closing an account can all be done securely, provided the
customer’s password is known. The main feature of the research is
that data remains safe in the banking system for a long time, so
that an account can still be opened after a long period. The
secure banking software can be accessed only by the bank and by
the customer, and a customer cannot access another customer’s
account in the e-bank system. A strong password rather than a
weak one is used to secure each customer’s bank account, because
a strong password is not easily guessed. Feedback can
be obtained easily because the Internet is virtual in nature.
Customer loyalty can be gained, and the bank can give personal
attention and quality service to its customers. Some studies have
been designed as surveys, in which respondents answer the
questions on their own. Some respondents agreed with our views,
while others did not. Respondents
have adequate time to give well-thought-out answers.


VIII. CONCLUSION 


E-commerce is generally viewed as the buying and
selling of items over the Internet, but any transaction that
is completed solely through electronic means can be
considered e-commerce. E-commerce and m-commerce
play a major role in online retail marketing, and the number of
people using this technology is increasing throughout the
world every day. E-commerce security is the
protection of e-commerce assets from unauthorized
access, use, alteration, or destruction.

Fraudsters are always looking to exploit online shoppers
who are prone to making novice errors such as those mentioned
above. Common mistakes that leave people vulnerable include
shopping on websites that are not secure, giving out
too much personal information,
and leaving computers open to viruses. In this paper we
discussed e-commerce security issues, security measures,
the digital e-commerce cycle (online shopping), security threats
and guidelines for safe and secure online
shopping through shopping websites.

While many of the risks of desktop Internet-based commerce
will persist, e-commerce itself presents new risks. The
best strategy for addressing the challenges and security risks
of Internet-based content is to build security into the platform
and applications themselves, rather than attempt to introduce
security patches afterward. For instance, Java provides type
safety, memory protection, and sandboxing for untrusted
content. While history has shown that various implementations
of the Java virtual machine have not been perfect, its model
of secure computation is relatively good. Hacking, identity
theft, and the theft of credit card and bank information are
some of the greatest security issues that deter consumers
from trusting online businesses. Ultimately, this means a loss of
potential business for organizations.

E-commerce security challenges are, however, not limited to
consumers. Businesses and corporate firms also face security
challenges, as their vital information, records and, most
importantly, their reputation are at stake.

Abstract—Data science is a multi-disciplinary field that uses
scientific methods, processes, algorithms and systems to extract
knowledge and insights from structured and unstructured data.
Data science principles apply to all data, big and small. Big
data is a term used to mean a massive volume of data generated
at immense speed. The quality of big data is of great relevance
and importance. Data quality has to be continually tracked,
monitored and tuned so that the data can be used as effectively
as possible during analysis.

Data quality is proportional to the quality of its source. Building
a Data Quality framework should consider multiple factors
such as Data Quality Dimensions, Data Profiling, Data Quality
Framework, Big Data Quality and Big Data Pre-Processing. Data
science has a wide range of applications. It is used in
fields ranging from the finance industry to transportation
and healthcare.

The healthcare industry uses data science for medical image
analysis, drug discovery, health bots or virtual assistants, and
predictive modeling for diagnosis. Predicting the presence
of heart disease plays an important role in preventing heart
attacks. A data science framework addresses how to estimate
the likelihood of heart disease by applying different
classification algorithms, and how the influence and distribution
of various parameters play a major role in disease prediction,
along with visualizations of the Cleveland cardiovascular medical
records. A data science framework is used for making accurate
predictions from past data by considering various data insight
features. A framework for medical data analysis has to be
developed for a moving target, as the very nature of biomedical
research based on big data requires an environment capable of
adapting quickly and efficiently in response to evolving
questions.

Index Terms—Data science, EDISON data science framework, 
eTRIKS analytical environment, big data, big data quality, 
heart disease prediction model, data science applications, mental 
health, data mining, visual data exploration. 


I. INTRODUCTION 


DATA SCIENCE is a multidisciplinary field concerned with raw,
unstructured data: where it comes from, what it represents
and the ways by which it can be transformed into valuable
input and resources to create business and IT strategies. In
contrast, data analytics focuses on processing and performing
analysis on existing datasets. Data science comprises automated
machine learning methods to investigate vast amounts of data
and to mine information from them. Data science is a subject
that arose chiefly from necessity, in real applications rather
than as a research domain. Over the years, it has evolved from
being used in the relatively narrow field of statistics and
analytics to being a pervasive presence in every
aspect of science and industry. Big data is an evolving
term that describes large volumes of structured,
semi-structured and unstructured data that are difficult to process
using traditional database and software techniques. Big Data
techniques help in acquiring, processing and analyzing large
amounts of heterogeneous data to derive valuable results.


Fig. 1. Data science application in different domains

Quality of information is affected by the size, speed and format
in which data is generated. Hence, the quality of Big Data is of
great relevance and importance. Data quality has to be
continually tracked, monitored and tuned so that the data can be
used as effectively as possible during analysis. Such data
can come from several different sources, such as business
transaction systems, customer databases, mobile applications,
websites, machine-generated data and real-time data sensors
used in Internet of Things environments. This comes with
complexities commonly known as the 5Vs, i.e., Volume, Variety,
Velocity, Veracity and Value.

A data science framework is used for making accurate
predictions from past data by considering various data insight
features. The EDISON Data Science Framework (EDSF) is a
collection of documents that define the Data Science profession.
Freely available, these documents have been developed to
guide educators and trainers, employers and managers, and
Data Scientists themselves. The EDSF components describe
methodologies and tools to support and assist educators and
trainers in developing their curricula and other training tracks
that will help deliver Data Science Professionals (DSP) with the
skills and competences required by the job market.

The EDISON vision for building the Data Science pro- 
fession will be enabled through the creation of a compre- 
hensive framework for Data Science education and training 
that includes such components as Data Science Competence 
Framework (CF-DS), Data Science Body of Knowledge (DS- 
BoK) and Data Science Model Curriculum (MC-DS). The 
CF-DS includes common competences required for successful 
work of Data Scientists in different work environments in 
industry and in research and through the whole career path. 
The future CF-DS development will include coverage of
domain-specific competences and skills and will involve domain
and subject matter experts. The CF-DS provides a basis
for the definition of the Data Science Body of Knowledge
(DS-BoK), the knowledge needed by Data Science practitioners
to perform all the data-related processes of their profession.
The BoK defines the content of a curriculum and needs to be
mapped to desired Learning Outcomes, which in turn
are defined by the competences required for target professions.
The Model Curriculum can be regarded as a blueprint that can
be used by educators and trainers to develop curricula at
various educational institutions and for different target groups.
The definition of the MC-DS should incorporate best practices and
be grounded in education theory to achieve the required learning
outcomes.

In the context of the European Translational Information 
& Knowledge Management Services (eTRIKS), we developed 
the eTRIKS Analytical Environment (eAE) in answer to the 
needs of analysing and exploring massive amounts of medical 
data. The eAE is a modular framework which enables the 
analysis of medical data at scale. Its modular architecture 
allows for the quick addition or replacement of analytics tools 
and modules with little overhead, thereby ensuring support of 
users as the data analytics needs and tools evolve. The eAE 
is flexible enough to support a variety of use cases across 
the biomedical domain. Our goal in developing the eAE is 
to enable the scalable exploration of multi-modal medical 
data using a flexible and modular architecture. The eTRIKS
Analytical Environment aims to provide users with an analytics
environment that has a user-friendly frontend, has endpoints
that can be easily integrated into tools, is modular and, finally,
is scalable in support of analysing large amounts of data.

Data science is mainly used in the finance industry,
environmental applications, aviation, agriculture, e-commerce
applications and the healthcare industry. The healthcare industry
uses data science for medical image analysis, drug discovery,
health bots or virtual assistants, and predictive modeling for
diagnosis. The primary use of data science
in healthcare is in clinical imaging, which covers
different imaging procedures such as X-ray, MRI and CT
scans. Data science is also used for heart disease prediction.
Heart disease symptoms depend on sex: men
are more likely to have chest pain, while women may have chest
pain together with difficulty in breathing and fatigue. This
disease kills a large number of people each year. Predicting the
presence of heart disease plays an important role in preventing
heart attacks. Similarly, mental health professionals can use
data science to solve the challenges they face.

Understanding public mental health issues using data
science, and finding solutions based on the findings of
data science projects, can be complex and requires advanced
techniques compared to conventional data analysis projects.
Data science is a multi-disciplinary field that uses
scientific methods, processes, algorithms and systems to extract
knowledge and insights from structured and unstructured data.
It is a “concept to unify statistics, data analysis, machine
learning and their related methods” in order to “understand and
analyze actual phenomena” with data. It employs techniques
and theories drawn from many fields within the context of
mathematics, statistics and information science.


II. EDISON DATA SCIENCE FRAMEWORK 


The EDISON vision for building the Data Science profession
will be enabled through the creation of a comprehensive
framework for Data Science education and training
that includes such components as the Data Science Competence
Framework (CF-DS) [1], the Data Science Body of Knowledge
(DS-BoK) [2] and the Data Science Model Curriculum
(MC-DS) [3]. Figure 2 illustrates the main components of the
EDISON Data Science Framework (EDSF) and their
interrelations, which provide a conceptual basis for the
development of the Data Science profession.


Fig. 2. Edison Data Science Framework Components


A. Data Science Competence Framework 


The CF-DS includes the common hard and soft skills
(i.e., technical and collaborative skills, also called social or
professional intelligence) that Data Scientists need to
work in a team and to act in the modern agile data-driven
enterprise, as well as the subject-specific knowledge and skills
that allow them to work in different scientific and technical
domains. The EDISON CF-DS development follows the guiding
principles of the European e-Competence Framework (e-CF3.0) [5].
The EDISON study on Data Science competences revealed that two
new groups of competences should be included that have not
been explicitly identified in previous studies and frameworks.
Figure 3 presents the resulting competence groups.


Fig. 3. Data Science Competence Groups 


1) Three competence groups identified in the NIST document
and confirmed by analysis of collected data are
• Data Analytics, including statistical methods, Machine
Learning and Business Analytics
• Engineering: software and infrastructure
• Subject/Scientific Domain competences and knowledge
2) Two identified competence groups that are in high demand
and are specific to Data Science are
• Data Management, Curation, Preservation (new)
• Scientific or Research Methods (new)


B. Data Science Body of Knowledge 


The CF-DS provides a basis for the definition of the Data
Science Body of Knowledge (DS-BoK), the knowledge needed
by Data Science practitioners to perform all the data-related
processes of their profession. The BoK defines the
content of a curriculum and needs to be mapped to desired
Learning Outcomes, which in turn are defined by the
competences required for target professions. The DS-BoK
should contain the following Knowledge Area groups (KAG),
which are defined after the CF-DS competence groups:


• KAG1-DSDA: Data Analytics group, including Machine
Learning, statistical methods, and Business Analytics

• KAG2-DSENG: Data Science Engineering group, including
software and infrastructure engineering

• KAG3-DSDM: Data Management group, including data
curation, preservation and data infrastructure

• KAG4-DSRM: Scientific or Research Methods group

• KAG5-DSBP: Business process management group

• KAG6-DSDK: Data Science Domain Knowledge group,
including domain-specific knowledge


C. Data Science Model Curriculum 


The Model Curriculum can be regarded as a blueprint that can
be used by educators and trainers to develop curricula at
various educational institutions and for different target groups.
The definition of the MC-DS should incorporate best practices and
be grounded in education theory to achieve the required learning
outcomes. The following learning and instructional models are
considered: Bloom’s Taxonomy, Constructive Alignment and
Problem Based Learning, and Competence Based Learning, which
have been partly evaluated in the authors’ earlier works [6, 7,
8]. From a practical perspective, the Model Curriculum
represents a tool for


• supporting the development of new Data Science
programmes (including selection of appropriate learning
units) tailored according to the proficiency levels required to
address the competences required for identified DSP profiles

• assessing the compliance of existing Data Science
programmes, facilitating the elicitation of potential gaps
related to specific competence groups and knowledge
areas implied by target professional profiles.


Hence, the Model Curriculum helps match the supply-side
and demand-side requirements for Data Science education.
The formal MC-DS definition will create a basis for the
compatibility of Data Science educational and training
programmes and, consequently, for the transferability of Data
Science related competences and skills.


III. BIG DATA QUALITY FRAMEWORK 


Big data is an evolving term that describes large volumes
of structured, semi-structured and unstructured data that
are difficult to process using traditional database and software
techniques. Quality of information is affected by the size, speed
and format in which data is generated. Hence, the quality of Big
Data is of great relevance and importance. Such data can come
from several different sources, such as business transaction
systems, customer databases, mobile applications, websites,
machine-generated data and real-time data sensors used in
Internet of Things (IoT) environments.


A. Big Data Life Cycle 


The data in a Big Data system traverses four phases within
the Big Data Lifecycle: Data Origin Identification,
Data Acquisition and Cleansing, Data Aggregation and
Storage, and Data Analysis, as depicted in Figure 4.


[Figure: Data Origin Identification; Data Transformation, Data Pre-Processing, Data Collection; Data Aggregation & Storage; Data Analysis]


Fig. 4. Big Data Lifecycle


• Phase 1: Data Origin Identification
This phase is concerned with the raw data being generated
in abundance from a variety of sources. The sources
may include social sites, financial applications, customer
relation applications, media web sites, images, etc. It is
critical to understand the source(s) of the data and
identify its veracity or reliability level, as the next phase
uses this understanding to process the data.

• Phase 2: Data Acquisition and Cleansing
This phase assimilates the data from many sources. This
raw data may contain anomalies, including
corrupted and badly formatted values, may be unsuitable for
consumption by the Big Data application, and may be a
combination of structured, semi-structured and unstructured
data. Such data needs to be filtered and cleansed, reformatted
and structured, deduplicated, stripped of illegal values and
compressed. These pre-processing steps are crucial to transform
the data to a level suitable or valuable for analysis.

• Phase 3: Data Aggregation and Storage
This phase ensures that the data from many heterogeneous
sources is suitably aggregated with joins across source
databases, or stored within databases or file formats on which
the analysis is planned.

• Phase 4: Data Analysis
This phase infuses relevance and sense into the gathered data.
This is a complex and evolving process that works by
comparing data characteristics to identify patterns, using
corrections as per domain knowledge or experience. The
analysis results aim to make users aware of the current
state and help them make forecasts and informed decisions.
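
The cleansing steps of Phase 2 can be sketched in a few lines of
Python. This is a minimal illustration under stated assumptions,
not the pipeline of any particular Big Data platform: the record
schema (the fields id and value), the rule that negative values
are illegal, and the function name cleanse are all hypothetical.

```python
def cleanse(records, required=("id", "value")):
    """Filter, reformat, validate and deduplicate raw records.

    The schema and the validity rule are assumptions for this sketch.
    """
    seen = set()
    cleaned = []
    for rec in records:
        # Filter: drop records with missing required fields.
        if any(rec.get(field) in (None, "") for field in required):
            continue
        # Reformat: coerce the value to a float; drop corrupted values.
        try:
            value = float(rec["value"])
        except (TypeError, ValueError):
            continue
        # Remove illegal values (negatives are assumed illegal here).
        if value < 0:
            continue
        # Deduplicate on the record id.
        if rec["id"] in seen:
            continue
        seen.add(rec["id"])
        cleaned.append({"id": rec["id"], "value": value})
    return cleaned


raw = [
    {"id": 1, "value": "3.5"},
    {"id": 1, "value": "3.5"},   # duplicate
    {"id": 2, "value": "n/a"},   # corrupted
    {"id": 3, "value": None},    # missing
    {"id": 4, "value": "-7"},    # illegal
    {"id": 5, "value": 10},
]
print(cleanse(raw))
```

Only the first and last records survive, each with its value
converted to a float.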


B. Importance of Data Quality 


Data quality has to be continually tracked, monitored
and tuned so that the data can be used as effectively as possible
during analysis. Building a Data Quality framework should
consider multiple factors, the critical ones being the business
domain, the source(s) of data, and whether the data is
structured or unstructured.

1) Data Quality Dimensions: Data Quality dimensions are
a means to assess the quality of data. These may be Intrinsic
and/or Contextual. Even though there is no standard or
universal regulatory definition of these quality dimensions, the
attempt here is to broadly agree on the commonly accepted
ones. The Contextual dimensions may vary by business domain,
application or relevance. The most popular Intrinsic
dimensions include Accuracy, Consistency, Uniqueness, Timeliness,
Validity and finally, Completeness.
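
Some of these dimensions can be turned into simple ratios. The
sketch below shows one possible way to score Completeness,
Uniqueness and Validity for a single column; since there is no
universal definition of the dimensions, the formulas, the
function names and the sample age column are assumptions made
for illustration. Missing entries are assumed to be encoded as
None, and the column is assumed to be non-empty.

```python
def completeness(values):
    """Share of entries that are present (not None)."""
    return sum(v is not None for v in values) / len(values)


def uniqueness(values):
    """Share of present entries that are distinct."""
    present = [v for v in values if v is not None]
    return len(set(present)) / len(present)


def validity(values, is_valid):
    """Share of present entries that satisfy a validity rule."""
    present = [v for v in values if v is not None]
    return sum(is_valid(v) for v in present) / len(present)


# Hypothetical age column with one missing, one repeated
# and one implausible entry.
ages = [34, 51, None, 51, 230, 47]
print(completeness(ages))
print(uniqueness(ages))
print(validity(ages, lambda a: 0 <= a <= 120))
```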

2) Data Profiling: Profiling data revolves around
establishing a framework of rules to ensure efficient assessment
of the quality of the data, supported by a specific definition
and characteristics of that quality. To ensure quality of
data in a Big Data system, data may be assessed and transformed
through numerous iterations in an effort to cleanse it and
to progress from an unstructured to a more structured state.

3) Data Quality Framework: As described in [29], a Data
Quality framework establishes rules that aim to ensure data of
enhanced quality. Processes to cleanse data, deduplicate data
and remove corrupted data instances, along with many other
sub-processes, form part of the quality framework.

4) Big Data Quality: Data quality for the still evolving and
growing field of Big Data is in itself a highly complex subject.
Large organizations previously believed that capturing data from
various business processes, multiple divisions, sales, profits,
geographies, locations and other parameters would empower
them to expand their business and spread strategically into
more areas. However, the challenge remained how to tap and
mine the huge volumes of data efficiently and in a qualitative
manner, using standardized quality processes that also cleanse
and improve the data quality as part of the Big Data life cycle.
Transforming the data into quality data is crucial
for any industry’s Big Data platform to be able to analyze the
data accurately and identify patterns that help in devising
future strategies as accurately as possible.


IV. MEDICAL DATA ANALYSIS FRAMEWORK 


The complexity, diversity, context richness and size of
biomedical data demonstrate the limitations of current
systems. Each data type presents its own set of challenges and
requirements, for example because it is a high-dimensional
format. The challenges of developing systems for analysing
multi-modal medical data consequently are:


• The massive amounts of data needed for analysis.

• The associated need for a scalable infrastructure.

• The quickly evolving needs of analytics, processing
and integration.


In the context of the European Translational Information 
& Knowledge Management Services (eTRIKS), we developed 
the eTRIKS Analytical Environment (eAE) in answer to the 
needs of analysing and exploring massive amounts of medical 
data. The eAE is a modular framework which enables the 
analysis of medical data at scale. Its modular architecture 
allows for the quick addition or replacement of analytics tools. 


A. Medical Analytics Background 


1) Causality: Causal analysis seeks to identify whether any
causal relationship exists between a risk factor and a
disease [11].

2) Testing: Testing has been of crucial importance in
biomedical research from early on. Poor and complex signals, the
curse of dimensionality and the computational needs of Bayesian
methods are only a few of the problems that researchers have
faced. Many techniques have been successfully applied to
overcome these problems. Principal component analysis (PCA) is a
frequently used signal separation technique for discovering
potential subgroups in a dataset [13].
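
As a rough illustration of what PCA computes, the sketch below
finds the first principal component of a small two-dimensional
sample using power iteration on the covariance matrix. This is a
pure-Python teaching sketch, not the method used in [13]; the
sample points, the function name and the iteration count are
assumptions, and the data is assumed to have non-zero variance.

```python
import math


def first_principal_component(rows, iterations=100):
    """Return the unit direction of greatest variance (power iteration)."""
    n, dims = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(dims)]
    centered = [[r[j] - means[j] for j in range(dims)] for r in rows]
    # Sample covariance matrix of the centered data.
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1)
            for j in range(dims)] for i in range(dims)]
    vec = [1.0] * dims
    for _ in range(iterations):
        # Multiply by the covariance matrix, then renormalise.
        vec = [sum(cov[i][j] * vec[j] for j in range(dims))
               for i in range(dims)]
        norm = math.sqrt(sum(x * x for x in vec))
        vec = [x / norm for x in vec]
    return vec


# Hypothetical measurements lying close to the line y = 2x.
points = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 7.8), (5.0, 10.1)]
pc1 = first_principal_component(points)
print(pc1)
```

Projecting each sample onto this direction gives a
one-dimensional summary in which potential subgroups can be
easier to see than in the original coordinates.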

3) Clustering: Cluster analysis or clustering is the task of 
grouping a set of objects in such a way that objects in the 
same group (cluster) are more similar to each other than to 
those in other groups. 
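
One common way to perform such grouping is k-means, sketched
below for scalar values. The text does not name a specific
clustering algorithm, so k-means is an assumption here, as are
the sample values, the starting centroids and the function name.

```python
def k_means_1d(values, centroids, iterations=10):
    """Plain k-means on scalars with caller-supplied starting centroids."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        # Assignment step: each value joins its nearest centroid.
        for v in values:
            nearest = min(range(len(centroids)),
                          key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Update step: move each centroid to the mean of its cluster,
        # keeping the old centroid if the cluster is empty.
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters


# Hypothetical expression levels forming two obvious groups.
levels = [1.0, 1.2, 0.8, 5.0, 5.3, 4.9]
centroids, clusters = k_means_1d(levels, centroids=[0.0, 6.0])
print(centroids)
print(clusters)
```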

4) Time series: Biological processes are often dynamic,
so researchers must monitor their activity at multiple time
points. Generating time series expression data has become
one of the most fundamental methods for querying biological
processes.

5) Prediction: Prediction can be undertaken within any of
the several approaches to statistical inference. Such inference
transfers knowledge about a sample to the whole population and
to other related populations, which is not
necessarily the same as prediction over time.


B. The architecture of the eTRIKS Analytical Environment 


[Figure: layers labelled Endpoints, Scheduling Layer, Caching Layer, Openlava and Computation Layer]


Fig. 5. A schematic representation of the architecture of the eTRIKS
Analytical Environment


1) General Environment: The operating system used on
the physical machines, virtual machines and containers
within this architecture is Ubuntu 16.04 LTS.

2) Endpoints Layer: For data exploration, the eAE relies
on two sets of tools: tranSMART and a modified version of
Jupyter.

3) Caching Layer: The eAE relies on two different tools for
the caching layer: the NoSQL database MongoDB 3.2.5 in the
eAE backend and the SQL database PostgreSQL 9.3 in tranSMART.
MongoDB, which does not require a schema, is an
excellent solution for adapting to any kind of data and acting
as a cache.

4) Scheduling Layer: This layer handles the scheduling and
monitoring of the clusters and jobs in the eAE.

5) Computation Layer: The cloud platform chosen to provide
on-demand resources, so that the system can scale out when
more compute-heavy or multiple computations need to be
executed, is OpenStack Liberty.


C. Experimental Evaluation 


1) Compute Scalability: We use benchmarks to evaluate the 
compute performance and scalability of the platform in terms 
of data size, number of executors and number of users. 

2) Scheduling Scalability: To evaluate the orchestration 
scalability, we submit the same job concurrently an increasing 
number of times. 
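
The idea of submitting the same job concurrently an increasing
number of times can be sketched as follows. The sketch uses a
local thread pool as a stand-in for the eAE scheduling layer and
an arbitrary arithmetic workload as the job; the workload, the
worker count, the batch sizes and the function names are all
assumptions, and the timings it prints say nothing about the
actual platform.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def job(n=10_000):
    """Stand-in workload; the real jobs are not specified in the text."""
    return sum(i * i for i in range(n))


def run_batch(copies, workers=4):
    """Submit `copies` identical jobs at once and time the whole batch."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(lambda _: job(), range(copies)))
    return results, time.perf_counter() - start


for copies in (1, 2, 4, 8):
    results, elapsed = run_batch(copies)
    print(f"{copies} concurrent jobs: {elapsed:.4f} s")
```

Plotting the elapsed time against the number of concurrent
submissions shows how the orchestration overhead grows with
load.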

3) User Scalability: The evaluation of user scalability is
still ongoing, as the number of client applications is steadily
growing. So far, there are ten Jupyter users actively using the
eTRIKS Analytical Environment and one tranSMART demo server.
There are no performance issues so far, and the compute layer
is efficiently shared across all users.


V. HEART DISEASE PREDICTIONS 


The data science framework for heart disease prediction is
shown in Figure 6.


[Figure: stages labelled Data Acquisition From Heart Patients Records, Data Cleaning, Model Selection, Accuracy and Interpreting]


Fig. 6. Data Science framework for heart disease prediction

1) Data Acquisition: The medical data used for heart
disease prediction is obtained from medical
reports collected in Cleveland and made available through the
UCI repository [22]. The database contains 300 records with 76
attributes, but only 14 of these attributes are used
for heart disease prediction, such as age (in years), sex (male
or female), chest pain type (typical angina, atypical angina,
non-anginal pain), trestbps (resting blood pressure),
cholesterol, fasting blood sugar (fbs > 120 mg/dl), resting
electrocardiographic results, thalach (maximum heart
rate) and angiographic status.

2) Data Cleaning: This step includes the removal of noisy
data, the identification of not-available and not-applicable
data items, and the treatment of those data segments. If the
data contain unfiltered and irrelevant segments, the results of
the analysis will not mean anything. Hence one of the crucial
steps, removing dummy values from the dataset, has been
performed. In the heart disease dataset, there are many missing
values in each attribute. Hence the missing values are replaced
by the median values of the data set[23]. The target variable of
the heart disease dataset has “Yes” and “No” as target labels.
These strings are converted to numbers, assigning Yes as 1 and
No as 0. The heart disease UCI data set is otherwise already
data wrangled, so the algorithms can work smoothly while
processing the data.
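
A minimal sketch of the two cleaning operations described above, median imputation and Yes/No label encoding, in plain Python. The records, the column names `age` and `chol`, and the helper names are hypothetical; they are not taken from the source or from the UCI files.

```python
from statistics import median

# Hypothetical records; None marks a missing value.
records = [
    {"age": 63, "chol": 233, "target": "Yes"},
    {"age": 41, "chol": None, "target": "No"},
    {"age": None, "chol": 250, "target": "Yes"},
    {"age": 57, "chol": 199, "target": "No"},
]


def impute_median(rows, column):
    """Replace missing values in `column` with the median of the observed values."""
    observed = [r[column] for r in rows if r[column] is not None]
    fill = median(observed)
    return [{**r, column: fill if r[column] is None else r[column]} for r in rows]


def encode_target(rows, column="target"):
    """Map the string labels "Yes"/"No" to the integers 1/0."""
    mapping = {"Yes": 1, "No": 0}
    return [{**r, column: mapping[r[column]]} for r in rows]


cleaned = records
for col in ("age", "chol"):
    cleaned = impute_median(cleaned, col)
cleaned = encode_target(cleaned)
```

The median is used here only because the text names it; it is less sensitive to outliers than the mean, which is one common reason for that choice.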

3) Data Exploration: Checking the relations between variables
is one of the crucial tasks. It shows how one variable affects
another. Data visualization plays a key role in this analysis:
instead of conventional methods, we visualize the data to
comprehend it. In this paper the major visualizations used are
bar plots, box plots, heat maps and ROC plots.

4) Attribute Selection: Selecting the required features is one
of the most important tasks, both to get the best results and to
reduce the time needed to train the model. The aim is to
decrease the training and evaluation time and to increase the
accuracy of prediction. Information gain is used to find the
features which play a major role in predicting the disease.
After applying it to the heart disease dataset, the predicted
output with the selected features is almost equal to that
obtained by combining all the features together. The top
features were age, sex, fasting blood sugar > 120 mg/dl and type
of chest pain, ranked against the target.
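
The information gain criterion mentioned above can be sketched as follows: it is the entropy of the target minus the weighted entropy of the target within each value of the feature. The tiny feature columns below are made up for illustration, and the function names are our own.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """Entropy of the labels minus the weighted entropy after splitting on the feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Hypothetical toy data: one feature separates the classes, the other does not.
target = [1, 1, 0, 0]
chest_pain = ["typical", "typical", "non-anginal", "non-anginal"]
sex = ["M", "F", "M", "F"]

gain_chest_pain = information_gain(chest_pain, target)
gain_sex = information_gain(sex, target)
```

Features are then ranked by their gain and the highest-ranked ones are kept. Continuous attributes such as age would need to be binned first, a step this sketch leaves out.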

5) Model selection: 


1) Logistic Regression (LR):
Logistic regression is widely used in machine learning
to solve classification problems. A sigmoid function
is used to compute the predicted probability, which is
converted into a class with the help of a threshold value.
It is applied when the target is a categorical variable,
that is, a variable whose values can be grouped into classes.

2) Support Vector Machine (SVM):
Support Vector Machine is a supervised learning method
used primarily for classifying data into various classes.
It uses hyperplanes as the decision boundary between
different classes. SVM analyses labelled training data
and then classifies new data according to what it has
learned during training. After plotting the data, a line
(a hyperplane in higher dimensions) is drawn to separate
the classes.

3) Naive Bayes (NB):
Naive Bayes is based on Bayes' theorem with the assumption
that the features are independent. The theorem describes
the relationship between the probability of the hypothesis
before the evidence is obtained (the prior) and the
probability after learning from the data (the posterior).

4) Random Forest (RF):
Random forest is an ensemble of decision tree models.
Ensemble models combine the results of several models.
Random forest is one of the most widely used algorithms in
predictive modelling and machine learning, and it is
flexible enough to perform both regression and
classification. In general, the more decision trees in the
forest, the better the accuracy, although the gains
diminish as trees are added.

5) XG-Boost (XG):
XG-Boost is an algorithm used for structured or tabular
data. It is an implementation of gradient-boosted decision
trees designed for speed and performance. The execution
speed of XG-Boost is very high compared to other gradient
boosting implementations. It handles structured data sets
in predictive modelling problems of both classification
and regression. The evaluation of the models is described
next.
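
Three of the ideas in the model descriptions above can be sketched in a few lines each: the sigmoid plus threshold used by logistic regression, the prior-to-posterior update of Bayes' theorem, and the majority vote of a tree ensemble. This is an illustration under stated assumptions, not the surveyed paper's implementation. The weights, probabilities and hand-written "trees" are hypothetical, and no model is trained here.

```python
from collections import Counter
from math import exp


def sigmoid(z):
    """Logistic function mapping a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + exp(-z))


def logistic_predict(features, weights, bias, threshold=0.5):
    """Return (probability, class) for one sample under a given linear model."""
    score = bias + sum(w * x for w, x in zip(weights, features))
    probability = sigmoid(score)
    return probability, int(probability >= threshold)


def bayes_posterior(prior, likelihood_pos, likelihood_neg):
    """Bayes' theorem for a binary hypothesis H: P(H|E) = P(E|H) P(H) / P(E)."""
    evidence = likelihood_pos * prior + likelihood_neg * (1.0 - prior)
    return likelihood_pos * prior / evidence


def forest_predict(trees, sample):
    """Majority vote over the class predicted by each tree in the ensemble."""
    votes = Counter(tree(sample) for tree in trees)
    return votes.most_common(1)[0][0]


# Hypothetical inputs, chosen only to exercise the functions.
probability, label = logistic_predict([0.0, 0.0], weights=[0.8, -0.3], bias=0.0)
posterior = bayes_posterior(prior=0.5, likelihood_pos=0.8, likelihood_neg=0.2)
trees = [
    lambda s: int(s["age"] > 55),
    lambda s: int(s["chol"] > 240),
    lambda s: int(s["thalach"] < 140),
]
vote = forest_predict(trees, {"age": 60, "chol": 220, "thalach": 130})
```

A real random forest would also train each tree on a bootstrap sample with random feature subsets, and a naive Bayes classifier would multiply per-feature likelihoods under the independence assumption. Both steps are omitted here for brevity.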

6) Model testing with Accuracy Parameters:

a) Precision: Precision is the proportion of units predicted as
positive (here, as having heart disease) that are predicted
correctly.

\[ \mathrm{precision} = \frac{TP}{TP + FP} \]

where TP is the number of true positives and FP is the number
of false positives.

b) Recall: Recall is the ability of a classification model
to identify all relevant instances.

c) F-measure: F-measure is a measure of how well the model is
predicting; it combines precision and recall into a single
score.

d) Receiver Operating Characteristic: This shows the true
positive rate against the false positive rate of the model
as the threshold that indicates a positive identification
is varied.
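
The measures above can all be computed from the four confusion-matrix counts. The sketch below uses the standard definitions (recall = TP / (TP + FN), F-measure as the harmonic mean of precision and recall, false positive rate = FP / (FP + TN)). The function name and the counts are hypothetical and are not results from the surveyed paper.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute precision, recall, F-measure and false positive rate from counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)  # also the true positive rate plotted on a ROC curve
    f_measure = 2 * precision * recall / (precision + recall)
    false_positive_rate = fp / (fp + tn)
    return {
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
        "false_positive_rate": false_positive_rate,
    }


# Hypothetical counts for one decision threshold.
metrics = classification_metrics(tp=40, fp=10, fn=20, tn=30)
```

Each decision threshold gives one (false positive rate, recall) pair. Repeating the computation over a range of thresholds traces out the ROC curve.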


VI. DATA SCIENCE IN PUBLIC MENTAL HEALTH 


Mental illness refers to all diagnosable mental disorders
which are characterized by abnormalities in thinking, feelings
or behaviours[15]. Mental illness is very common, and mental
health is vital for the overall well-being of human beings. The
Joint Strategic Needs Assessment (JSNA) is a process by which
local governments and clinical groups in the UK assess the
present and future healthcare and wellbeing needs of the local
community to inform local decision-makers.


[Figure: Mental health and wellbeing JSNA toolkit 2017/18.
Legible panels: a Data profile (domains include prevalence &
incidence, protective factors, quality & outcomes; 100+ overview
metrics; GP and ward level metrics; each domain follows the life
course; gateway to a range of NMHIN topic-based profiles), a
Knowledge guide (bite-sized cut-and-paste sections; focus on
prevention, wellbeing, risk profiling and community assets;
intelligence on policy, case for change, data and interventions;
guide follows the life course), and Supporting components]

Fig. 7. JSNA Toolkit Structure


A. Visual Data Mining 


Visual data mining involves the invention of visual
representations that can be applied in all three stages of the
data mining life cycle: data preparation, model derivation and
validation. In this paper, we proposed a novel visual data
mining framework for mental health study.


B. Visual Data Exploration 


Visual data exploration and data mining techniques are used to
gain hidden knowledge and to understand the correlations between
factors associated with mental health. Both techniques require
user involvement at different phases, and visualisation is used
to support knowledge discovery and pattern recognition.
Exploratory Data Analysis is the data mining task in which
visualization plays a major role. Model visualization is the
process of using visual techniques to make the discovered
knowledge understandable and interpretable by humans. The degree
of automation of data mining algorithms varies considerably, as
different levels of human guidance and interaction are usually
required. However, it is the algorithm, not the user, that looks
for patterns[16]. Geospatial data have become important, as many
visual analytics approaches require finding spatial patterns
and relationships between the data points. A key difference
between data mining and visual data exploration is that visual
data exploration is a completely human-guided process. Visual
exploration of the data and of the results from the models has
been considered an interesting application and has attracted
attention from both the academic and industry communities.
Visual analytics methods can be selected in a variety of ways,
ranging from simple bar plots to complex geo-visualisation
plots. Domain knowledge is vital for visual data exploration, as
it helps the analyst to understand and interpret the results.

[Figure: visual analytics process; legible labels: Visual Data
Exploration, User interaction]

Fig. 8. Visual Analytics Process


VII. DATA SCIENCE APPLICATION AND ITS PLATFORM 


Data science has a broad range of applications. It is used in
fields ranging from the finance industry to transportation and
healthcare. Different enterprises use data science to support
their production, make more intelligent decisions, and create
original products that are tailored to consumer needs. Data
science is mainly used in the healthcare industry, the finance
industry, environmental applications, aviation, agriculture and
e-commerce applications.


A. Healthcare industry 


The healthcare industry uses data science for medical image
analysis, drug discovery, health bots or virtual assistants, and
predictive modeling for diagnosis. The primary use of data
science in the healthcare business is in medical imaging. There
are different imaging techniques such as X-ray, MRI, and CT
scans. Each of these techniques visualizes the internal parts of
the human body. Traditionally, doctors would manually inspect
these images and look for irregularities in them. However, it
was often hard to find microscopic deformities, and thus doctors
could not always give a reliable diagnosis[9].


B. E-commerce business 


Data science is particularly significant in the internet
business and retail industry. It is mainly used in the
e-commerce business for predicting sales of goods and services.
Data science is also heavily used in recommendation systems,
which use a customer's historical purchase behaviour to
recommend products. Currently, drones are also used to deliver
products from businesses to consumers. Retail brands analyse
data to build customer profiles, learn the customers' pain
points, and market their products accordingly to push the
customer towards buying. Recommendation engines are among the
most significant tools in a retailer's arsenal. Retailers
leverage these engines to drive a customer towards purchasing a
product. Giving recommendations helps retailers to increase
sales and to steer trends[9].
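
As a rough illustration of recommending products from historical purchase behaviour, the sketch below scores items bought by customers whose purchase history overlaps with the target customer's. This simple co-purchase heuristic is our own choice for the example; the source does not say which recommendation method is meant. The customers and products are invented.

```python
from collections import Counter


def recommend(history, customer, top_n=2):
    """Suggest items bought by customers with overlapping purchase histories."""
    owned = history[customer]
    scores = Counter()
    for other, items in history.items():
        if other == customer:
            continue
        overlap = len(owned & items)
        if overlap == 0:
            continue  # no shared purchases, so no evidence of similar taste
        for item in items - owned:
            scores[item] += overlap
    return [item for item, _ in scores.most_common(top_n)]


# Hypothetical purchase histories.
purchases = {
    "alice": {"laptop", "mouse"},
    "bob": {"laptop", "mouse", "keyboard"},
    "carol": {"mouse", "monitor"},
    "dave": {"phone"},
}

suggestions = recommend(purchases, "alice")
```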


C. Agriculture application 


Agriculture is the base of the world economy; still, it faces
a mounting number of crises, for example climate change, erratic
monsoons or the lack of them, droughts, floods, the migration of
farmers towards the cities in search of better-paying jobs, and
more. People engaged in farming are the last to be looked after,
even though they are the ones who feed the entire world. In the
agriculture industry, data science is mainly applied to managing
crop diseases and to predicting rainfall and yield[10].


D. Finance industry 


Data science plays a vital role in automating financial tasks.
The banking industry uses data science to detect credit card
fraud and to decide whether a loan can be given to a particular
customer. Data science is broadly used in areas like risk
analysis, customer management, and fraud detection. Natural
language processing is used in the finance industry to automate
tasks such as online guidance systems and smarter
governance[12].


E. Environment Application 


Environmental data is growing in complexity, size, and
resolution. Addressing the kinds of large, multidisciplinary
problems faced by today's environmental scientists requires the
ability to use available data and information to inform
decision-making. Effectively integrating heterogeneous data
from numerous sources to support holistic analyses and to
extract new knowledge requires the use of data science[14].


F. Data science in aviation


Because of the rapid development of advanced technologies, an
enormous amount of real-time information on flight data, flight
performance, airport conditions, air traffic conditions,
weather, ticket prices, passenger comments, crew comments, and
so forth is now widely available from a diverse set of sources,
including flight performance monitoring systems, the operational
systems of airlines and airports, and social media platforms.
The development of data analytics in aviation and related
applications is also growing quickly.


VIII. NEED FOR A DATA SCIENCE WORKFORCE 
FRAMEWORK 


As data science grows in usage, and teams grow in size,
specialization within the team naturally occurs. For example,
data science teams often have people that focus on analytics
(often called data scientists) and others that focus on
collecting/cleaning data (often known as data engineers). In
reality, many specializations are for “vertical” subject matter 
experts, such as data architects, big data engineers, data 
analysts, or machine learning experts. Being a “horizontal” 
data scientist refers to one having general expertise in several 
disciplines sufficient to guide the work of a diverse team of 
specialists. While these roles are starting to be commonly used, 
little has been published on them. In fact, in a literature review 
on team data science processes, the concept of roles was 
not identified. The role of data scientist is often assigned to 
anyone who performs any activity that touches data, including 
data management, data processing systems, data analytics, 
and so on. However, the skills required for these different
tasks vary greatly. Big data represents a significant change
in the techniques and technologies used for data-intensive
computing. The move to parallelization has added complexity
to big data solutions and introduced the need for several
specialized skills. Relatedly, the term “data science” has become
ubiquitous, used to describe any activities that touch data.
Consequently, it is difficult to ascertain what skills are needed 
to perform the specific tasks required to build and deploy big 
data analytics (BDA) systems. This is compounded by the fact 
that the field is evolving from work performed by an individual 
that does data science to a team that does data science. 


A. The Need for Workforce Descriptions 


The main motivation for workforce descriptions is the need 
to identify, recruit, train, develop, and maintain an appropri- 
ately skilled workforce by providing a common language to 
categorize and describe the type of data science work that 
needs to be done. While the roles and vocabulary are data 
science specific, the need for workforce descriptions is not 
limited to data science. In other words, data science is not 
the only discipline that has required clarification of roles and 
skills. For example, cybersecurity is another domain where this
need has existed. For cybersecurity, the U.S. National Institute
of Standards and Technology (NIST) developed the National
Initiative for Cybersecurity Education (NICE) Cybersecurity
Workforce Framework[17], which clarifies the categories,
specialty areas, and work roles for cybersecurity practitioners.
In addition, it provides lists of tasks, knowledge, skills,
and ability descriptions, mapping them to work roles. Another
effort to provide workforce definitions was the U.S. Department
of Defense Cyber Workforce Framework[18]. This work is ongoing,
including revisions to a companion document, A Role-Based Model
for Federal Information Technology/Cyber Security Training. The
benefits listed in the NICE report apply equally well to the
domain of data science and include the following:


• Employers: track staff skills, training, and qualifications;
improve position descriptions; develop career paths; and
analyze proficiency.

• Educators: develop curriculum and conduct training for
programs, courses, and seminars for specific roles.

• Technology Providers: identify work roles, tasks, and
knowledge, skills, and abilities associated with their
products.

Of course, this work would also be of value for students (in 
understanding how their education maps to different possible 
roles) and employees (understanding roles where their skills 
can be leveraged most effectively). Hence, providing job 
titles and job descriptions that more clearly identify tasks, 
knowledge, skills, and abilities would benefit the data science 
community and remove the overloading of the term data 
scientist. 


B. Skills vs Roles 


While there are some skills in common across different 
types of data science roles, some skills might be specific to
a particular role. Just as the NICE workforce framework has
knowledge, skills, and abilities that can apply to multiple work 
roles, it will be important to ensure that each data science work 
role is similarly described. More generalist practitioners would 
be able to fit into a number of roles, but non-overlapping role 
descriptions are important to ensure clarity. In addition, skills 
might vary based on the type of data science project, such 
as projects with more or less discovery required within the 
analysis[19]. 


C. Challenge Due to Lack of Process Model 


There are several challenges related to the development of 
a data science workforce framework but perhaps the most 
significant one is that there is not an agreed upon process
model for data science. In the late 1990s, the Cross-Industry
Standard Process model for Data Mining (CRISP-DM) was 
developed by a consortium to resolve the conflicts between 
individual data mining process models, to promote commu- 
nication in the discipline, and to ensure greater data mining 
success. This model is still the framework followed by the 
largest number of practitioners but it predated cloud, big data, 
machine learning, agile, “the Internet of Things”, and so on, 
and it did not consider system development or management 
processes[20]. 


D. In Relation to Software Development Lifecycles 


Most analytic system development results in situational 
awareness through reports or business intelligence. Software 
development lifecycles (SDLCs) are geared toward this kind of 
requirements-driven analytics system development. Advanced 
analytics systems are, however, outcomes-driven and require 
experimentation for the choice of data and data features, model 
building, and model evaluation and optimization. Some data 
science skillsets would overlap those used in SDLCs, but for 
completeness, roles would need to be distinctly described. 
Care should be taken, however, to align the data science and 
SDLC models, in particular lining them up with agile and 
DevOps standard methodologies. 


IX. QUALITY ASSURANCE FOR DATA SCIENCE 


Data science would be unique among formal sciences due to its
heavy utilization of the scientific method, its dependence on
algorithmic research methods, its partial automation of the
scientific method, and its engagement of both social and natural
sciences, sometimes concurrently. Data science will have
principles, such as that a piece of data is meaningful, and the
theories derived from it are valid, only within a sufficiently
defined context (with a set-theoretic basis). It will have
concepts, such as the critical mass of data required to carry
out a certain verification, deploy a certain verifying method,
or build a certain model. It will have effects, such as the
increase in the verifiability of data (and of the theories that
explain them) with increases in volume, relevance, accuracy,
coherence, and the number of sets of which an element of data is
a member. It will also have a Philosophy of Data Science that
discusses the underlying assumptions pertaining to
meaningfulness. Providing data science as a science is as
important as providing data science as a service. As the growing
volume of data crosses successive thresholds corresponding to
ever-increasing critical masses of data, different levels,
mechanisms and contexts of cross-checking would become possible,
and beyond a certain limit the scenarios that such data
represent could only be represented using complex systems. The
emphasis on contextuality and multi-contextuality in data
science has the potential to impact the philosophy of the
scientific method.


A. Discovery as a Service 


The amount, sources, variety and complexity of available data
grow by leaps and bounds, and finding, selecting and accessing
the right data sources depends on semantics, cost, quality,
resource constraints, privacy requirements, legal matters, etc.
The volume of data that would continuously emanate from data
sources would make it almost impossible for search engines to
cache them. This would create a distinction between “searching
for data” and “searching for data sources”, and consequently
search results would soon contain data sources rather than
data. Whether there are going to be (a large number of) data
tanks that would be filled with historical data from all those
data sources and preserve them for future use is a moot point.
Semantics and the other parameters pertaining to data sources
mentioned above would determine how data sources are selected
for composition or aggregation. Services which provide (raw)
data from sources will, more often than not, have heavy
payloads, and the progress made in data-intensive computing,
storage and communications would be helpful.


B. Composition Of Data Rich Services 


Data rich services are primarily characterized not only by
their (substantial) supply of data from (original and/or
reliable) sources but also by how closely they represent the
real-world situations which give rise to such data. Data is
required from a multitude of sources to fill data reservoirs for
rich analysis, and the composition or aggregation of data rich
services would bring the data together. While the involvement of
human intelligence is essential for selecting proper data
sources, the vastness of the number of data sources and their
dynamicity, complexity and heterogeneity necessitate the
automation of the composition and aggregation of services that
transport data from various sources. One important area that is
worth paying attention to is the de-duplication[21] of data
reservoirs to ensure the coherence of the knowledge contained in
them. For a number of reasons, such as different sources
providing the same data or the ability to derive certain data
from other data, data flowing from services composed or
aggregated on the fly can fill data reservoirs with redundant
entries.
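
A minimal sketch of the de-duplication idea, assuming that two records describe the same fact when a chosen set of key fields matches after light normalisation. The record layout, the key fields and the service names are hypothetical. Real de-duplication of data reservoirs would also have to handle derived and near-duplicate data, which this example does not attempt.

```python
def deduplicate(records, key_fields):
    """Keep the first record seen for each normalised key; drop later repeats."""
    seen = set()
    unique = []
    for record in records:
        # Normalise so that trivial formatting differences do not hide duplicates.
        key = tuple(str(record[f]).strip().lower() for f in key_fields)
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


# Hypothetical records arriving from two services that report the same reading.
incoming = [
    {"source": "service_a", "station": "ST-1", "date": "2021-11-22", "value": 4.2},
    {"source": "service_b", "station": "st-1 ", "date": "2021-11-22", "value": 4.2},
    {"source": "service_a", "station": "ST-2", "date": "2021-11-22", "value": 3.9},
]

reservoir = deduplicate(incoming, key_fields=("station", "date"))
```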


C. Simulators 


On top of the data reservoirs would operate simulators that
closely reconstruct the real world, or the scenarios in it, that
gave rise to the data contained in the data reservoirs and
processed by the simulators. As the real world is inherently
complex, simulators would draw on methods used in complexity
theory, such as non-linear science, bifurcation theory, network
theory, game theory, chaos theory, information theory,
superstatistics, cellular automata, agent-based modelling and
data mining. Simulators based on complexity science would have
to contain all relevant human knowledge to ensure that
simulations reflect the state of the art.


D. Future Trends 


Providing Simulation-as-a-Service would ensure that interested
parties beyond a few elites would be able to build and test
hypotheses, most probably via novice-friendly visual tools, thus
enabling crowd data science. They would further be able to
provide feedback on the general consistency between data and the
models that fit (or do not fit) them, so that the quality of
data (as well as of theories), and therefore QoK as a whole, is
assured from multiple perspectives.


X. HOW TO USE STOCK DATA FOR DATA SCIENCE
EDUCATION


A. Software Architecture 


The simulation software (Fig. 9) contains three components: the
Yahoo Finance server, which provides free, real-time stock data;
a local server hosted in the department, whose major role is to
collect historical data and provide the necessary computing for
students; and the user interface, a self-developed trading
platform that displays stock information and enables students to
implement trading strategies. The local server and the user
interface are implemented with Shiny, a web framework for R.

Fig. 9. Software Architecture


B. R Packages 


In this project, RStudio is used as the development
environment. The version of R used in this project is 3.3.2,
which was released in October 2016. Some key R packages are
listed as follows.

• Shiny:

An open source R package that provides an elegant and
powerful web framework for building web applications
using R.

• Quantmod:

A package designed to assist the quantitative trader in
the development, testing, and deployment of statistically
based trading models.

• Dygraphs:

An R interface to the dygraphs JavaScript charting library.
It provides rich facilities for charting time-series data.

• Magrittr:

A package providing a mechanism for chaining commands
with a forward-pipe operator, %>%. This operator forwards
a value, or the result of an expression, into the next
function call or expression. There is flexible support for
the type of right-hand side expressions.


C. Picked Stocks 


Because stock types vary widely, only five different stocks
are picked to represent the different sectors and industries.
Well-known stocks are selected to stimulate student interest.


AAPL: 

Apple, Inc. engages in the design, manufacture, and 
marketing of mobile communication, media devices, per- 
sonal computers, and portable digital music players. It 
represents the “Consumer Goods” sector in “Electronic 
Equipment” industry. 

CELG: 

Celgene Corp. is an integrated global biopharmaceutical 
company, which engages in the discovery, development 
and commercialization of therapies for the treatment 
of cancer and inflammatory diseases. It represents the
“Healthcare” sector in “Biotechnology” industry.

JPM: 

JPMorgan Chase & Co. operates as a financial services 
company worldwide. It represents the “Financial” sector 
in “Money Center Banks” industry. 

XOM: 

Exxon Mobil Corporation is engaged in energy business. 
The Company is engaged in the exploration, production, 
transportation and sale of crude oil and natural gas, and 
the manufacture, transportation and sale of petroleum 
products. It represents the “Basic Materials” sector in 
“Major Integrated Oil & Gas” industry. 

ABX: 

Barrick Gold Corporation (Barrick) is a gold mining 
company. It represents the “Basic Materials” sector in 
“Gold” industry. 


D. User Interface Design 


Due to the educational goal of the project, the design of 
the trading platform is to fulfill two objectives: 1) to integrate 
enriched content for knowledge delivery, and 2) to be friendly 
to students, who may not be comfortable with busy data 
environment. Thus, four panels are implemented into the user 
interface. 


1) Exploring panel 


The exploring panel (Figure 10) is designed to display
real-time stock information. The content includes real-time or
historical stock prices and multiple indicators, such as BBand,
Volume, RSI and CCI.

Fig. 10. Exploring Panel

• Different Time Interval Selection: the time interval
can be 1 day, 1 week, 1 month, 3 months or 1 year. The
default setting is 3 months.

• Zoom In or Zoom Out: the user can zoom in or out of the
plot easily by clicking and dragging the area.

• Buy Points and Sell Points: trading transactions are
also shown on the plot. A buying transaction is shown as
a green dot, and a selling transaction is shown as a red
dot.


2) Control Panel 


The control panel (Figure 11) is designed to enable
students to pick the fundamental parameters of the trading
platform.


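[Figure: control panel with a “Choose a Stock” drop-down (AAPL
selected), a “Trading Strategy” drop-down (an option beginning
“Simple” is shown), and a “Date range for backtesting” field
(01/01/16 to 10/25/17)]

Fig. 11. Control Panel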


The first drop-down menu allows the student to pick one of
the five stocks. The second drop-down menu lets the student
select one of three trading strategies. The third component,
which is optional, is the time range over which the user
wants to do back-testing. The default range is 1/1/2016 to
the current date. However, the user can extend the starting
time back to 1/1/2001.


3) Reporting Panel 
The reporting panel (Figure 12) displays the summary
of back-testing results. It presents the time range of
back-testing, and gives the number and amount of buy
and sell transactions. The unrealized earning and total
earning amount are also shown.

[Figure: reporting panel screenshot. Legible values: Date
2016-01-01 to 2017-10-25; # of buy = 61; # of sell = 61; total
Buy and total Sell: 8957.8 and 8916.999978 (the order is not
recoverable from the extraction); “How many are unrealized?”
40.8000219999983; “What is the earning if I realize the end
period?” 198.210021999998]

Fig. 12. Reporting Panel

Furthermore, a more detailed report is provided for further
analysis during back-testing. Our system automatically
generates a .txt file, which contains detailed information on
every transaction. The transaction times and buy/sell prices
could be used either to verify the accuracy of a strategy or
to evolve it into a more comprehensive strategy later.
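
The kind of summary the reporting panel gives (number and amount of buy and sell transactions) can be sketched from a transaction log as follows. The platform itself is written in R; this Python version is only an illustration. The log layout (`date`, `side`, `price`), the function name and the sample transactions are hypothetical, since the source does not describe the format of the generated .txt file.

```python
def summarize_transactions(transactions):
    """Count buy and sell transactions and total the traded amounts."""
    buys = [t for t in transactions if t["side"] == "buy"]
    sells = [t for t in transactions if t["side"] == "sell"]
    total_buy = sum(t["price"] for t in buys)
    total_sell = sum(t["price"] for t in sells)
    return {
        "n_buy": len(buys),
        "n_sell": len(sells),
        "total_buy": total_buy,
        "total_sell": total_sell,
        "net": total_sell - total_buy,  # proceeds from sales minus cost of purchases
    }


# Hypothetical transaction log, one share per transaction.
log = [
    {"date": "2016-01-04", "side": "buy", "price": 100.0},
    {"date": "2016-02-01", "side": "sell", "price": 110.0},
    {"date": "2016-03-01", "side": "buy", "price": 105.0},
    {"date": "2016-04-01", "side": "sell", "price": 104.0},
]

summary = summarize_transactions(log)
```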


XI. CONCLUSION 


Our group has carried out a survey of data science frameworks
and applications. We discussed the eAE architecture, a framework
for big data quality, and the EDISON data science framework.
This paper also addressed the concepts of ambiguity and quality
assurance for data science. Next-generation data science
applications in heart disease prediction, public mental health
and education were also discussed in detail. Given the
continuing evolution of big data and data science, current usage
might not show how the industry is evolving, so a different,
complementary next step may be to rerun an analysis of role
usage in the industry in the future to identify trends over
time.
Abstract—This paper intends to act as a reference point for
information related to using technology to promote the academic
success of students with Learning Disabilities (LD), based on
information retrieved from various sources.

The major objectives of this study are:


• To explore the advantages resulting from the usage of
technology to promote academic success for students with
learning disabilities;

• To understand the obstacles that hinder the successful
implementation and integration of technology that could be
used to promote academic success for students with learning
disabilities.

The methodology used in this study is a systematic
literature review of relevant research studies and articles.

This study found that schools and educators need to implement
the requisite pedagogy in such a manner that students with LD
benefit from it, rather than confining them to special
instructions and objectives. Integrating technology into regular
education to support students with LD need not be costly or
necessitate massive training. Students with learning
disabilities are ever-present in schools today, and so is the
technology to support these students. Assistive technology
supports the success of students with LD in both general
education and special education settings. This paper discusses
the challenges students with LD may face in school and the
assistive technology educators can use to help address these
challenges. Assistive technology, ranging from low-tech devices
like pencil grips to advanced software like speech recognition,
can empower children with LD with the skills required to pursue
their education on an equal footing with their peers. Emerging
technologies like social networking sites and freely available
communication software can aid effective interaction with
teachers and peers.

Index Terms—Learning disability, academic performance, as- 
sistive technology 


I. INTRODUCTION 


LEARNING disability is a challenge for any school and
teacher. If the learning disability is ignored, unnoticed
and unanswered, such children’s needs will not be met in
regular classes. Identification of a child with a learning
disability is the first step towards reducing its impact. Only
then are teachers able to distinguish such children from other
students.
The Learning Disabilities Association of America takes the
view that “Learning disability is due to genetic and/or
neurobiological factors that alter brain functioning in a manner
which affects one or more cognitive processes related to
learning”, which can interfere with learning basic skills such
as reading, writing and math. Technology can help students with
learning disabilities overcome challenges in learning,
especially in the areas of writing, reading and mathematics.

[Figure: learning disabilities are described as a number of
disorders which may affect the acquisition, organization,
retention, understanding or use of verbal or nonverbal
information. They involve impairments in one or more processes
related to perceiving, thinking, remembering or learning
(language processing, phonological processing, visual spatial
processing, perceptual motor integration, processing speed,
memory and attention, executive functions) and may interfere
with one or more of the following: oral language (e.g.
listening, speaking, understanding); reading (e.g. decoding,
phonetic knowledge, word recognition, comprehension); written
language (e.g. spelling and written expression); mathematics
(e.g. computation, problem solving); organization; and social
skills]

Fig. 1. Learning Disabilities


Assistive technology serves two major purposes: to augment
an individual’s strengths, thereby counterbalancing the
effect of the disability, and to provide an alternative mode
for performing a task. Tech tools are therefore an assistive
technology for students facing learning disabilities: they
allow students to compensate for their disabilities. For
students with learning disabilities, technology is an assistive
tool that replaces an ability that is either missing or impaired
and supports the student in accomplishing a task.
The potential of assistive technology in education classes for
students with LD is enormous.


Some of the advantages of using technology include promoting
academic success for students with LD in
writing, mathematics, spelling, reading and comprehension,
enhancing their organizing capability and, most importantly,
encouraging their social inclusion. Assistive technology
offers many advantages by assisting students
with LD who find writing difficult. Once students
overcome writing issues, they tend to be more successful within
the general education environment. Many teachers find it
difficult to adapt the curriculum to the learning requirements
of children with LD without external assistance. Assistive
technologies now help teachers choose the most
appropriate technology for students with LD.


II. LEARNING DISABILITY: CLASSIFICATION


The Learning Disabilities Association of America classifies
learning disabilities as follows:

1) Dyscalculia: It affects the ability to understand
arithmetic concepts.

2) Dyslexia: It affects the ability to process language.

3) Dysgraphia: It affects writing ability and motor
skills.

4) Non-Verbal Learning Disabilities: These affect the
interpretation of nonverbal cues like facial expressions or body
language, and may be accompanied by poor coordination.

5) Oral/Written Language Disorder: Learning disabilities
that affect an individual’s understanding of what they read or
of spoken language.


III. LEARNING DISABILITY: SYMPTOMS 


Many children have trouble in reading, writing or perform- 
ing other learning-related tasks at some point. This does not 
mean they have learning disabilities. A child with a learning 
disability often has several related signs, and they don’t go 
away or get better over time. The symptoms of learning 
disability also vary from person to person. 

Some signs and symptoms of learning disability are: 


A. Early Childhood 


Children having difficulty with zippers, buttons and tying shoe
laces; delayed early language development; problems with
rhyming words, recognizing the difference between
similar-sounding words, or segmenting words; difficulty with
pronunciation; difficulty copying from the board or a book;
difficulty finding the correct word; trouble learning shapes,
days of the week, the alphabet and numbers; problems with writing
and drawing tasks; inconsistent spacing between letters or words;
poor understanding of uppercase and lowercase letters or
similar-looking words.


B. Middle Childhood 


A child may not be able to remember content; problems with
handling money due to memory difficulties; difficulty in merging
sounds to make words; trouble telling time; illegible
handwriting; being easily distracted by sounds; omitting or
not finishing words in a sentence; repeating letters in a
word (Ibid); interchanging letters and numbers in a
sentence; trouble learning a new language.


C. Preadolescent Stage 


Difficulty in organisation; bad handwriting; trouble with
reading comprehension as well as math concepts; difficulty
in following class discussion and poor class participation;
difficulty with reading out loud; difficulty telling left from
right; difficulty with syntax and grammar; a gap between
written work and understanding; trouble keeping track of thoughts
already written down.


IV. SKILLS AFFECTED BY LEARNING DISABILITY 
A. Motor Skills 


It has been observed that some children with learning
disabilities lack motor skills. They have poor hand-motor
coordination: holding a pencil or pen is difficult
for them, and sometimes they hold it in an inappropriate
way. They may also find it difficult to perform mechanical
operations as they may not be able to hold the equipment
properly.


B. Spatial Ability


Children with learning disabilities may have difficulty in
understanding directions and locating new places. They
may also lack spatial ability, which may hamper their chances
of being good architects. Directionality (left,
right, front and back) is a weakness among children with
learning disabilities (Ibid).


C. Dealing with Mathematics 


Children with learning disabilities may be good at oral
maths but, due to poor reading skills, not so good
at written maths. They might face problems in dealing with
descriptive mathematics, engineering, chemistry or physics
problems that rely on written text rather than numbers or
formulas.


D. Dealing with Creativity 


Children with learning disabilities may also be creative.
There is no child who does not possess one ability or another.
Instead of rebuking and scolding the child for not performing
like others, the child’s ability should be identified and recognized,
and the child should be given an opportunity to develop it.


E. Problems in Memory 


Memory involves visual memory as well as auditory memory.
Children with learning disabilities are weak in both types
of memory. They cannot repeat what they see or hear. This
memory disorder makes their learning process very slow.
They are easily distracted due to their poor memory.


V. IDENTIFICATION OF LEARNING DISABILITY 


Detecting a learning disability is not easy, especially
as it can go undetected in the early stages
of childhood and even during schooling.

A learning disability can be diagnosed during the elementary
school stage itself. A child may be diagnosed when he or she
has problems and finds it difficult to learn
grammar and syntax or to read longer and more complex
material. Early intervention has more successful outcomes
than intervention that is initiated later, and early
intervention presupposes early identification. At present, there is little
knowledge about a universal screening procedure to guide
referrals for schools.

Sarva Shiksha Abhiyan provides a manual with a checklist
for learning disability, which acts as a helpful tool for initial
screening by teachers in schools. At present, however, the
assessment itself is being used as a screening/identification
procedure. Children are referred for assessment by the
teacher or school for reasons of failure, underachievement or
behavioural problems. For the same reasons, parents may take
the child for assessment directly and avail of the examination
concessions that exist for children with learning disabilities.
In rural areas, awareness of learning disability is low and there
are few assessment facilities.

The National Education Policy seeks to ensure equitable access and
opportunities for all students with disabilities. Assessment and
certification agencies, including the proposed new National
Assessment Centre, PARAKH, are to formulate guidelines and
recommend appropriate tools for conducting such
assessment, from the foundational stage to higher education.


VI. TECH TOOLS FOR STUDENTS WITH LEARNING
DISABILITIES


Fig. 2. Assistive Technologies


Technology can help students with learning disabilities meet
challenges in learning by providing computer-supported
tools that compensate for their disabilities. The potential of assistive
technology for students with learning disabilities is great.
It improves academic achievement in written expression,
reading, mathematics and spelling. It also helps to improve
organization and foster social acceptance. A necessary
component of this effort is collaboration between classroom
teachers and assistive technology specialists: the use of
technology must be a collaborative effort. An assistive technology
device is any piece of equipment or product system that is used
to increase, maintain or improve the functional capabilities of
individuals with disabilities. Assistive technology serves two
major purposes:


1) To augment an individual’s strength thereby counterbal- 
ancing the effect of the disabilities. 
2) To provide an alternative mode for performing a task. 


Tech tools are therefore an assistive technology for
students facing learning disabilities. They help students
cope with learning disabilities by allowing them to
compensate for their disabilities.


VII. READING DISABILITY 


Students with reading disabilities struggle to catch up
with their peers who do not have a learning disability. This is due
to their inability to gain information and knowledge through
traditional print. Students with learning disabilities are
more likely to drop out of school than
students without learning disabilities. Assistive
technologies that address reading include the following:

1) Text-to-speech: Text-to-speech is software that provides
computer-synthesized speech to read digital text. The use
of text-to-speech allows students to access text visually (i.e.,
reading as well as seeing the words highlighted) while hearing
it read aloud. Many text-to-speech programs allow for
customization to meet students’ needs or desires. Text-to-speech
can be used on computers, personal computing devices,
handheld devices or e-readers. Further, text-to-speech
programs can be acquired both for free and at cost. Research
reports that text-to-speech helped high school students with LD
improve their reading fluency and comprehension of material, and
that pairing text-to-speech capabilities with other software
increased seventh- and eighth-grade students’ overall reading ability.

2) Accessible Text: Accessible text is text that can be
either manipulated or transformed into a format other
than traditional print. For students with LD, accessible text is
commonly called digital text or eText. It can be used with
personal devices; software allows digital text to be converted
into MP3 files and read aloud on players such as an iPod.

3) Supported eText: Supported eText is digital or electronic
text supplemented with rewording, description, media,
highlighted text, or other strategies to increase comprehension
of material. Use of supported eText has an emerging research
base; supported eText improved the reading comprehension
achievement of high school students with mild-to-moderate
disabilities and, in some cases, aided in comprehension
related to functional skills for students with intellectual
disabilities.


VIII. WRITING DISABILITY 


Students with learning disabilities tend to have deficits in
various academic areas, and writing is one of them. Students
with learning disabilities typically write slowly,
are prone to spelling mistakes and struggle with sentence
structure and grammar. These students may face problems in the
future: e-mail is used to communicate in the workplace, and an
individual who cannot communicate through writing
will be at a disadvantage throughout life.

1) Word Processing: Computer-based accommodations for
dyslexia may not require specialized hardware or software. For
example, a person with dyslexia can benefit from regularly
using built-in word processor features such as:

• spell checking

• grammar checking

• font size and color changes

The use of spell checkers can allow the person with learning 
difficulties to remain focused on the task of communication, 
rather than getting bogged down in the process of trying 
unsuccessfully to identify and correct spelling errors. Many 
word processing programs also include tools for outlining 
thoughts and providing alternative visual formats that may 
compensate for difficulty in organizing words and ideas. 
Additionally, color-coded text options and outline capabilities 
present in many word processing programs are useful tools 
for those with difficulty sorting and sequencing thoughts and 
ideas. A word processor can also be used as a compensatory 
tool for a person with dysgraphia. Use of a keyboard may 
be a viable alternative for an individual who has difficulty 
expressing his thoughts. 

2) Word Prediction: Spelling words correctly while typing
can be a challenge for some people with dyslexia. Word
prediction programs help the user by offering a list of the most likely
word choices based upon what has been typed so far. Rather
than remembering the spelling of a word, the user can refer to
the predictive list, choose the desired word and continue
writing.

3) Graphic Organizers and Planning Tools: Struggling
writers often skip important steps, such as the planning, revising
and editing stages of the writing process. Graphic
organizers can be utilized in paper, software or online formats
to help in the planning and revision of writing content. Graphic
organizers allow students to visually represent their ideas or
text structure. Studies report that graphic organizer software helps
students with mild disabilities to plan and organize their
writing. Another tool to help in organization and planning is
SOLO, a combination of assistive technologies that supports
reading and writing. It contains word prediction software,
talking word processing software, a graphic organizer and
a text reader, which are all assistive technology tools that help
with the writing process.
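
The word prediction idea described in item 2 above can be sketched
in a few lines of Python. This is only an illustrative sketch, not
the method of any particular product: it assumes that "most likely"
simply means "most frequent in text the user has already written",
and the function names build_vocabulary and predict_words are
hypothetical.

```python
from collections import Counter


def build_vocabulary(text):
    """Count how often each lowercase word occurs in the sample text."""
    return Counter(text.lower().split())


def predict_words(vocabulary, typed_prefix, limit=3):
    """Return up to `limit` words that start with the typed prefix,
    most frequent first (ties are broken alphabetically)."""
    prefix = typed_prefix.lower()
    candidates = [(word, count) for word, count in vocabulary.items()
                  if word.startswith(prefix)]
    # Sort by descending frequency, then alphabetically for stable output.
    candidates.sort(key=lambda item: (-item[1], item[0]))
    return [word for word, _ in candidates[:limit]]


# Assumed sample of text previously typed by the user.
sample = "the teacher thanked the student because the student tried"
vocab = build_vocabulary(sample)
print(predict_words(vocab, "t"))
print(predict_words(vocab, "stu"))
```

The user types only the first letters and then picks a word from
the returned list, so the full spelling never has to be recalled.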


IX. MATHEMATICS 


Technology to support mathematics for students with LD
includes not only that which can support access to the
mathematics (e.g., literacy tools) but also tools to support both
computation and problem-solving. Mathematics-specific
technologies for students with LD include the calculator (i.e.,
four-function, scientific, or graphing), manipulatives,
computer-assisted instruction (CAI), and anchored instruction.

1) Calculators: Researchers found that students with and
without disabilities who used calculators on multiple-choice
and open-ended problems were more likely to answer questions
correctly. Further, a strong research base from the field
of general education supports calculator use with
students, reporting improved problem-solving skills, improved
conceptual understanding, and no negative consequences for
skill development. In addition, calculators can lighten the
cognitive load for students with LD.

2) Manipulatives: The use of concrete manipulatives is a
best practice for the mathematics education of secondary students
with disabilities. Researchers found that the use of concrete
manipulatives resulted in students successfully solving mathematics
word problems. Concrete manipulatives can support students
in learning through a hands-on experience and can be used
to assist students with LD in developing conceptual
understanding by progressing from concrete (i.e., physical
manipulatives, such as tiles or fraction strips) to
semiconcrete (i.e., pictorial representations) to abstract (i.e.,
symbols, such as numbers and operations), known as the
concrete-semiconcrete-abstract (CSA) approach. Beyond concrete
manipulatives, teachers can consider using virtual
manipulatives, which are typically virtual replicas of the concrete
objects or, in other words, interactive online tools. While little
research exists on virtual manipulatives to support students
with LD in mathematics, this technology remains an option,
particularly for teachers who may have limited access to
concrete manipulatives.

3) Computer-Assisted Instruction: CAI is another assistive
technology to support students with LD in mathematics. CAI is
defined as “computer programs that provide drill-and-practice,
tutorial, or simulation activities offered either by themselves
or as supplements to traditional, teacher-directed instruction”.
Researchers studying students with high-incidence
disabilities, including students with LD, found that CAI offered
benefits to students in terms of computation and
problem-solving, and work on CAI and mathematics for students with
LD has reported on the effectiveness of CAI for mathematics
performance. This research is primarily focused on elementary-aged
students and basic facts (e.g., addition or multiplication).
Regardless, if
teachers use CAI to support students with LD in mathematics,
they should seek programs that particularly target students
with LD or include evidence-based practices for teaching this
population.

4) Anchored Instruction: The final mathematics technology
for students with LD is anchored instruction (i.e., video
presentation of an applied mathematics lesson or problem).
Research on anchored instruction shows that students with LD
improved in their mathematical problem-solving after engaging
with anchored instruction. Anchored instruction allows
students with LD to access and engage in problem-solving
skills that extend beyond computation.


X. ORGANIZATION AND SELF-MANAGEMENT OF 
ASSISTIVE TECHNOLOGY 


Assistive technology can also assist students with LD in other
areas of their school and post-school lives, such as
organization and memory.

Students with learning disabilities can use handheld devices
like smartphones, PDAs and iPads to assist with
organization and memory. These help with note taking through audio
recording, reminders, calendar features, etc. Some researchers
have developed software called the Kid Tool Support System
(KTSS). Programs within the KTSS support and teach students
regulation, problem solving, organization, etc.

Older forms of technology, such as a personal data manager or the
WatchMinder, can also be used to support students with
organization and memory. The WatchMinder is a sports watch that
can be programmed to provide 30 alarms in addition to 65
preprogrammed messages. Students can be reminded not only
about taking medication, but also to turn in an assignment
or study for a test, or simply be prompted to pay
attention.


XI. TEACHERS’ PROFESSIONALISM 


This study is aimed at investigating teachers’ beliefs and 
their professionalism regarding the use of assistive technolo- 
gies (AT) in teaching children with specific learning disabili- 
ties (SLD). 

1) Materials and methods: 

To achieve the study purpose, the researchers developed a 
scale, ‘teachers’ beliefs and professionalism’, consisting 
of four subscales. A random sample of 157 SLD teachers 
participated in the study by completing the study scale 
and fifteen teachers were later interviewed. 

2) Results: 

The SLD teachers’ self-reported use of AT in the curriculum
of children with SLD was high. The sub-scale on teachers’
perceptions of their professionalism in using assistive
technologies in the teaching process had the highest mean,
whereas the availability of AT had the lowest. Results
revealed a statistically significant correlation between
teachers’ beliefs and professionalism. The results also
revealed that there were no significant differences between
SLD teachers according to the teachers’ gender
or experience level, or the level of child disability. The
only difference concerned the availability of AT sub-scale,
between public and private schools, in favour of private
schools.


XII. IMPLICATIONS FOR REHABILITATION 


1) There is a vital need to investigate teachers’
professionalism and beliefs regarding applying AT for children
with SLD in inclusion settings, especially in developing
countries.

2) The availability of AT sub-scale had the lowest mean. 

3) The teachers’ perceptions of their professionalism in 
using AT in teaching had the highest mean. 

4) It is hoped that this study provides decision-makers in
the Ministry of Education (MoE) with valuable insights
for developing the use of AT in teaching reading and writing
to children with SLD, and for developing their capacity to
play a crucial role in creating new, appropriate training
techniques through which teachers can acquire the
skills to make richer use of AT to enhance
children’s mental and social abilities
in inclusive schools.


5) Provide training for teachers and the teams who work 
with children with SLD to match particular technologies 
to specific needs to help the children with SLD to be 
more independent. 

6) Future studies should be done to get a complete picture
of the role of AT in teaching children with SLD as
perceived by teachers, principals and parents, as well as
to investigate the effectiveness of using AT in developing
the skills of children with reading and writing difficulties, to
motivate schools to enhance the independence of these children.
Further studies should also be conducted to compare the
instructional practices in the field of AT used in Jordanian
inclusive schools and in schools applying international
programmes, to benefit from their instructional practices
and their effective use of AT with children with SLD in
inclusive schools.


XIII. TEACHERS’ SKILLS 


Regarding the teachers’ perceptions of their professional
skills in using technology to teach children with
SLD, the items with the highest mean values were that
teachers strongly agreed they had the ability to plan an IEP
using technology tools, could take into consideration the
individual differences between children with SLD while using
AT tools, and had the necessary skills to employ, and
benefit from, technology tools in management of the classroom
environment. The items with the lowest mean values were that
teachers could manage programmes and services
provided by AT tools with children with SLD; that they had
the skill to use supporting hardware and software in the EI
programmes; and that they could use technology in the process
of assessing and diagnosing children’s difficulties.


XIV. ROLE OF SCHOOL 


Most classrooms have children with learning disabilities
who need continuous support. Teachers must help them by
using screening procedures to identify learning disabilities
and by planning to mitigate them. Specific action should
include the use of appropriate technology that allows and enables
children to work at their own pace. Schools may also provide
resources for the integration of children with disabilities,
such as special educators and training and resource
centres. Different categories of children with disabilities have
different needs. Schools and school complexes should
extend support, such as accommodations and modifications, with
support mechanisms tailored to the needs of children
with disabilities to ensure their full participation.


XV. CONCLUSION 


Technology can help students with LD compensate for
challenges in learning, especially in the area of writing, by providing
computer-supported tools. In addition, this technology can
ease frustration, increase motivation and a sense of peer
acceptance, and improve productivity in the classroom and at home.
The IDEA amendments specify that assistive technology be
considered in developing individualized educational plans.

Collaborative planning teams must develop a vision of
technology for individual students and general education
classrooms. Team members need to determine the effectiveness of
current technology and closely monitor students to ensure that
the necessary modifications are made to reflect the changing
abilities of the individuals.

The potential of assistive technology for students has not 
been realized; the future is uncertain but holds much promise. 
For individuals with disabilities, this technology can be one 
way to break down barriers to learning. The use of AT for 
children with LD who are learning in inclusive schools holds 
great promise. 

Research shows that improvements in reading, writing, 
spelling and maths difficulties are possible. This research 
concluded that teachers of children with LD who learn in 
inclusive schools believe in the importance of employing 
and integrating AT in the teaching process. However, it was 
revealed that participants use simple AT tools, so it can be 
concluded that increased training and resource availability 
would prompt further implementation of AT.


Abstract—Cyber attacks are fast moving and increasing in
number and severity. When attacks occur, the attacked
enterprise responds with a collection of predetermined actions.
Applying digital forensics, which helps in the recovery and
investigation of material on digital media and networks, is one of
these actions. Cyber forensic investigation includes the capture
and analysis of digital data to prove or disprove whether an
internet-related theft has been committed. Cyber forensics is fairly
new as a scientific discipline and deals with the acquisition,
authentication and analysis of digital evidence. One of the biggest
challenges in this domain has thus far been the availability of
real data sources. In this paper we present how social media
data sources may impact future directions in cyber forensics,
and describe how these data sources may be used as new digital
forensic artifacts in future investigations and for experimentation,
along with the tools used for this kind of investigation.

Index Terms—Digital forensics, cyber forensics (CF), network 
forensics, social media data sources. 


I. INTRODUCTION 


ALL digital devices, such as cell phones, tablets, laptops
and desktop computers, can be used for criminal activities
such as fraud, drug trafficking, homicide, hacking,
forgery, terrorism, etc. To fight these criminal activities,
digital forensics is used to help investigate cybercrimes and to
identify device-assisted crimes and their perpetrators.
There are many definitions of digital forensics, but the one
that describes it properly is: “Digital forensics is the discipline
that combines elements of law and computer science to collect
and analyze data from computer systems, networks, wireless
communications, and storage devices in a way that is
admissible as evidence in a court of law” (National Cybersecurity
and Communications Integration Center, NCCIC). It involves
applying computer investigation and analysis techniques to
solve a crime and provide evidence to support a case. It is the
process of identifying, preserving, analyzing and presenting
digital evidence in such a manner that the evidence is
legally acceptable.
Exchanging information on the internet is convenient for us,
but it also gives opportunities to criminals. Computer forensics
can be applied in areas such as phishing, corporate
fraud, intellectual property disputes, theft, breach of
contract and asset recovery. Its advantage is the ability to
search and analyze
a large amount of information quickly and efficiently and
to identify key pieces of data that can be used to assist in
the formation of a legal case. Cyber forensic tools
make it much easier to probe the evidence. Forensic techniques
also have other applications, such as analyzing the quality of
food and predicting fire disasters. Nowadays, digital evidence
has become of paramount importance. Subsequently, the forensic
sciences extended their scope to include digital evidence, and a new
domain was born: Cyber Forensics (CF).


II. TYPES OF DIGITAL FORENSICS 
Digital forensics has five main types:
1) Computer forensics 
2) Mobile device forensics 
3) Database forensics 
4) Network forensics 
5) Forensic data analysis 


III. NETWORK FORENSICS 


Network forensics is a sub-branch of digital forensics relating
to the monitoring and analysis of computer network traffic
for the purposes of information gathering, legal evidence, or
intrusion detection. Unlike other areas of digital forensics,
network investigations deal with volatile and dynamic
information. Network traffic is transmitted and then lost, so network
forensics is often a proactive investigation.

Network forensics generally has two uses. The first, relating 
to security, involves monitoring a network for anomalous 
traffic and identifying intrusions. An attacker might be able 
to erase all log files on a compromised host; network-based 
evidence might therefore be the only evidence available for 
forensic analysis. The second use relates to law enforcement.
In this case analysis of captured network traffic can include 
tasks such as reassembling transferred files, searching for 
keywords and parsing human communication such as emails 
or chat sessions. 
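
Two of the tasks mentioned above, reassembling transferred data and
searching it for keywords, can be illustrated with a minimal Python
sketch. It assumes the capture has already been reduced to
(offset, payload) pairs for a single stream; real tools work on
full packet captures, and the function names reassemble and
find_keywords are hypothetical.

```python
def reassemble(segments):
    """Rebuild a byte stream from (offset, payload) pairs captured out of order.

    Assumes offsets are byte positions within one stream. Bytes already
    covered by an earlier segment (e.g. retransmissions) are skipped.
    """
    stream = bytearray()
    for offset, payload in sorted(segments):
        if offset > len(stream):
            raise ValueError(f"gap in capture before offset {offset}")
        # Keep only the part of the payload that extends the stream.
        stream.extend(payload[len(stream) - offset:])
    return bytes(stream)


def find_keywords(stream, keywords):
    """Return {keyword: [byte offsets]} for case-insensitive matches."""
    haystack = stream.lower()
    hits = {}
    for keyword in keywords:
        needle = keyword.lower().encode("utf-8")
        positions = []
        start = haystack.find(needle)
        while start != -1:
            positions.append(start)
            start = haystack.find(needle, start + 1)
        if positions:
            hits[keyword] = positions
    return hits


# Assumed example capture: out of order, with one retransmitted segment.
captured = [(12, b"the invoice "), (0, b"Please send "),
            (12, b"the invoice "), (24, b"today")]
stream = reassemble(captured)
print(stream)
print(find_keywords(stream, ["invoice", "password"]))
```

Sorting by offset restores the original order, and the slice drops
bytes that a retransmission would otherwise duplicate.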


IV. PHASES OF CYBER FORENSICS 


Figure 2 shows the various phases of cyber forensics. We
consider each of them in turn. 


A. Identification Phase 


This is the process of identifying evidence material and its
probable location. A basic requirement in evidence collection is
that evidence must be presented without alteration. At the time
of evidence collection, system logs, time stamps and security
monitors need to be checked thoroughly. Once
evidence is collected, investigators need detailed records
to establish a chain of custody. The chain of custody is a vital
part of computer forensics and the legal system; its goal is
to protect the integrity of evidence, so evidence should be
physically secured in a safe place along with a detailed log.
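
One possible way to keep such a detailed log tamper-evident is to
chain the entries with hashes, so that altering an earlier entry
invalidates every later one. The Python sketch below is an
assumption made for illustration and not a procedure described in
the sources of this paper; the field names and the functions
add_entry and verify_log are hypothetical.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder hash used before the first entry


def _hash_record(record):
    """SHA-256 over the record's fields, serialized in a fixed key order."""
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def add_entry(log, handler, action, timestamp):
    """Append a custody entry that embeds the hash of the previous entry."""
    previous_hash = log[-1]["entry_hash"] if log else GENESIS_HASH
    record = {"handler": handler, "action": action,
              "timestamp": timestamp, "previous_hash": previous_hash}
    record["entry_hash"] = _hash_record(record)
    log.append(record)
    return record


def verify_log(log):
    """Return True only if every entry still matches its stored hashes."""
    previous_hash = GENESIS_HASH
    for record in log:
        body = {k: v for k, v in record.items() if k != "entry_hash"}
        if record["previous_hash"] != previous_hash:
            return False
        if _hash_record(body) != record["entry_hash"]:
            return False
        previous_hash = record["entry_hash"]
    return True


# Assumed example entries.
custody_log = []
add_entry(custody_log, "Investigator A", "seized laptop", "2021-11-22T10:00")
add_entry(custody_log, "Lab technician B", "created disk image", "2021-11-22T14:30")
print(verify_log(custody_log))
```

Such a log complements, and does not replace, physically securing
the evidence.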


B. Acquisition Phase 


This phase saves the state of the evidence so that it can be
analyzed further. The goal is to save all digital values. Here
a copy of the hard disk, commonly called an image, is created.
Three commonly accepted types of forensic acquisition are
mirror image, forensic duplication and live acquisition. A
mirror image is a bit-for-bit copy that involves backing up the
entire hard disk. Forensic duplication is an advanced
sector-by-sector method that copies every bit of the evidence
without omitting any. It is the most common type of acquisition
because it creates a forensic image of the e-evidence that also
contains file slack. It uses tools such as Forensic Toolkit
(FTK) Imager, the UNIX dd command, or EnCase. FTK can also
identify steganography, the practice of camouflaging data in
plain sight. Live acquisition refers to the acquisition of a
machine that is still running; it can retrieve both static and
dynamic (volatile) data.
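One common way to check that an acquired image really is a bit-for-bit copy is to compare cryptographic digests of the source and the image. The sketch below is illustrative only: the function names are hypothetical, SHA-256 is an assumed choice of hash, and it is not the procedure of FTK Imager, dd or EnCase.

```python
import hashlib


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in fixed-size chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Reading in chunks keeps memory use flat even for very large images.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def image_matches_source(source_path, image_path):
    """True when the acquired image has the same digest as the source."""
    return sha256_of_file(source_path) == sha256_of_file(image_path)
```

Recording the digest in the evidence log at acquisition time lets a later examiner repeat the comparison and detect any alteration of the image.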


C. Analysis Phase 


This is the process of understanding, re-creating, and
analyzing arbitrary events gathered from digital sources. The
analysis phase takes the acquired data and examines it to find
pieces of evidence. This phase also determines whether the
system was tampered with to avoid identification. Three types
of examination can be applied in forensic analysis: limited,
partial or full. Limited examination covers the data areas
specified by legal documents or based on interviews. It is the
least time-consuming and most common type. Partial examination
deals with prominent areas: key areas such as log files, the
registry, cookies, e-mail folders and user directories are
examined. It is based on general search criteria developed by
forensic experts. Full examination is the most time-consuming
and least frequent process. It requires the examiner to look at
every possible bit of data to find the root causes of the
incident. It uses tools such as The Coroner's Toolkit, EnCase,
and FTK. The Coroner's Toolkit runs under UNIX, and EnCase
runs under Windows. EnCase can process large amounts of data
and allows the user to run predefined scripts to pull
information from the data being processed. FTK contains a
variety of separate tools (text indexing, NAT recovery, data
extraction, file filtering, e-mail recovery, etc.) to assist in
the examination.
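The idea of examining an image against general search criteria can be illustrated with a simple keyword scan over a raw image file. This is a minimal sketch under stated assumptions: the function name and the keyword list are hypothetical, keywords are matched as UTF-8 byte strings, and real tools such as FTK build indexes rather than scanning linearly.

```python
def find_keywords(image_path, keywords, chunk_size=1 << 20):
    """Return a sorted list of (byte_offset, keyword) hits in a raw image file."""
    patterns = [(k, k.encode("utf-8")) for k in keywords if k]
    if not patterns:
        return []
    # Keep enough trailing bytes to catch a keyword split across two chunks.
    overlap = max(len(raw) for _, raw in patterns) - 1
    hits = []
    tail = b""
    base = 0  # file offset of the first byte of the current window
    with open(image_path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            window = tail + chunk
            for keyword, raw in patterns:
                start = window.find(raw)
                while start != -1:
                    # Matches lying wholly inside the tail were reported
                    # when the previous window was scanned.
                    if start + len(raw) > len(tail):
                        hits.append((base + start, keyword))
                    start = window.find(raw, start + 1)
            tail = window[-overlap:] if overlap else b""
            base += len(window) - len(tail)
    return sorted(hits)
```

Reporting byte offsets rather than just a yes/no answer lets the examiner go back to the exact location in the image, which fits the need to document where each piece of evidence was found.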


D. Reporting Phase 


This phase comprises documentation and evidence retention.
The scientific method is used to draw conclusions from the
gathered evidence, and the conclusions are presented for the
corresponding evidence in accordance with the applicable cyber
laws. Factors to be considered in this process are prosecution,
data retention and cost. Meeting these requirements calls for
log archival, and the archived logs must be protected to
maintain their confidentiality and integrity.
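One way to make tampering with archived logs detectable is to chain the entries with a hash, so that changing any entry invalidates every later digest. The sketch below is an assumption for illustration, not a method described in the source: the names `GENESIS`, `chain_logs` and `verify_chain` are hypothetical, and the scheme covers integrity only, not confidentiality.

```python
import hashlib

GENESIS = "0" * 64  # assumed starting value for the chain


def _link(previous_digest, entry):
    """Digest of one log entry bound to the digest of the entry before it."""
    return hashlib.sha256((previous_digest + entry).encode("utf-8")).hexdigest()


def chain_logs(entries):
    """Return (entry, digest) pairs where each digest covers the previous one."""
    chained = []
    previous = GENESIS
    for entry in entries:
        previous = _link(previous, entry)
        chained.append((entry, previous))
    return chained


def verify_chain(chained):
    """True when no entry or digest in the archive has been altered."""
    previous = GENESIS
    for entry, digest in chained:
        if _link(previous, entry) != digest:
            return False
        previous = digest
    return True
```

For the check to be meaningful, the final digest would have to be stored separately from the archive, since anyone able to rewrite the whole archive could otherwise recompute the chain.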


V. ADVANCING CYBER FORENSICS: WHAT THE SOCIAL 
WORLD HAS TO OFFER 


Cyber forensics (CF) is fairly new as a scientific discipline
and deals with the acquisition, authentication and analysis of
digital evidence. One of the biggest challenges in this domain
has thus far been the lack of real data sources. This paper
presents how social media data sources may impact future
directions in cyber forensics, and describes how these data
sources may be used as new digital forensic artifacts in
future investigations and for experimentation. Missing real
data sources are a serious problem across different areas of
computer science. Most agencies, vendors and providers have to
keep their data secure and private. One cannot ignore the fact
that machine learning needs a training set, and in CF an
appropriate training set has to come from real cases.


A. Major challenges in Cyber Forensics
Below we list some of the major challenges in CF.


1) The lack of real data sources.
2) The young and ever-changing nature of the field.
3) The dependency on tools.
4) The lack of published error rates for the various widely
used digital forensics tools.
5) The lack of basic research in this domain.
6) The lack of agreed-upon standards and processes.
7) The limitation of the hardware standards being used
during the acquisition of data.
8) The volatility of evidence such as RAM.
9) The continuous change in technology.
10) The use of anti-forensics techniques and tools.
11) The lack of a common body of knowledge.


B. Social Media: New Digital Forensic Artifacts


With the rise of social media applications on a multitude
of platforms comes the potential for these applications to
leave behind digital forensic artifacts that may be integral
to an investigation. For example, research has shown how to
extract Facebook chat logs from disks, and how many digital
artifacts mobile social applications leave behind, such as
usernames, passwords, chat messages, posts, friends, location
data and pictures. Digital forensic artifacts that can be
extracted from social media applications are critical sources
of digital evidence.


TABLE I
CF: OPEN SOURCE TOOLS AND PROPRIETARY TOOLS

Product name | Purpose / Platform | License
Internet Evidence Finder | Search a hard drive / Windows | Commercial
CacheBack | History + social networking sites analysis / Windows | Commercial
Wireshark | Network packet capturing / Windows, Linux, Mac OS | Open Source (GPL)
TcpDump | Intercept TCP/IP / Windows, Linux, Mac OS | Open Source (GPL)
File Viewers | View files that are in different format and save or export them in different format / Windows | Open Source


TABLE II
CF: WHAT THE SOCIAL WORLD HAS TO OFFER

Social Data | Applicability to CF | Computation Source
Text Posts | Author attribution | Computational Linguistics
Images | Facial and Object recognition | Image Processing
Geolocation Data | Binding location of a suspect | Geographic Information Systems + Programming
Videos | Facial and Object recognition | Image Processing
Language Used | Cyber profiling | Sociology + Psychology


C. Social Media: New Public Data Sources


The other, non-intuitive source of data for CF research that
the social world has to offer is publicly available data.
Publicly available social media posts that include data such as
geolocation, unstructured text and multimedia files are of
critical importance for advancing the CF domain. Table II
shows some ideas of how these publicly accessible data sources
may be leveraged during a CF investigation.


VI. EVIDENCE COLLECTION: ROLE AND USE OF SOCIAL 
MEDIA EVIDENCE 


Social media evidence must be collected by a legally and
scientifically appropriate forensic process that also respects
the privacy rights of individuals. Social media evidence
provides a vast source of information about a potential
suspect's or victim's profile that can be mined in close to
real time. The metadata (information accompanying the content)
and network data hold considerable potential to assist in
criminal investigations and to authenticate evidence from
online social networks (OSNs). Presently, it is a legal
requirement in a substantial number of serious crime
investigations to seize and examine the digital devices of
victims and suspects. The data on these devices help to find
traces of crime or a history of the digital activities
performed by the user. The use of social media as evidence is
quite common in criminal cases: many criminal cases are now
routinely investigated, prosecuted and defended using social
media evidence. The roles that social media evidence plays are
listed below.


1) Documented communications are used to assess a person's
state of mind.

2) Records of daily online activities are used as evidence
of presence or absence at a specific time or place.

3) Photographs showing lifestyle provide proof of
spending or income and physical health.

4) Photographs provide evidence of where and with
whom a person spends time.

5) Online behavior provides traces of cybercrime such as
cyberbullying, cyber-harassment or cyber predating.

6) Online profiles offer evidence of impersonation and
identity theft.

7) OSN profiles are used for background checks on potential
suspects and witnesses.


VII. EVIDENCE COLLECTION: FORENSIC ACQUISITION OF 
SOCIAL MEDIA CONTENT 


Forensic artifacts are recognized as a critical source of 
evidence on social media. Hence most of the research efforts 
are focused on forensic evidence acquisition. The requirements 
for forensic collection from social media are generally outlined 
as 


1) Collecting the relevant data or content from multiple
social media sites.

2) Collecting metadata with social media content.

3) Ensuring the integrity of data in the forensic collection
process.
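The three requirements above can be sketched as a single evidence record that stores the content together with its metadata and a digest used later to check integrity. This is an illustrative assumption rather than the format of any real tool: the field names, the function names and the use of SHA-256 over canonical JSON are all hypothetical.

```python
import hashlib
import json
from datetime import datetime, timezone


def _digest(body):
    """SHA-256 over a canonical (sorted-key) JSON encoding of the record body."""
    canonical = json.dumps(body, sort_keys=True, ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def make_evidence_record(platform, content, metadata, collected_at=None):
    """Bundle content, metadata and collection time, then seal with a digest."""
    collected_at = collected_at or datetime.now(timezone.utc)
    record = {
        "platform": platform,          # which social media site it came from
        "content": content,            # the post, message or comment text
        "metadata": metadata,          # e.g. author, timestamps, location
        "collected_at": collected_at.isoformat(),
    }
    record["sha256"] = _digest(record)
    return record


def record_is_intact(record):
    """True when the record still matches the digest taken at collection."""
    body = {key: value for key, value in record.items() if key != "sha256"}
    return _digest(body) == record["sha256"]
```

Sorting the keys before hashing makes the digest independent of the order in which fields were added, so the same record always yields the same digest.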


The forensic acquisition of social media data through device
forensics suffers from the limitation that only partial data
can be retrieved. Applications are used to access OSNs on
mobile devices, but these handheld devices are not designed to
store an entire copy of the social media content. A few
commercial tools, such as CacheBack, Internet Evidence Finder
(IEF) and EnCase Forensic, are also used with limited success
to retrieve social media forensic artifacts from browser
history and databases.


VIII. CF: OPEN SOURCE TOOLS AND PROPRIETARY 
TOOLS WITH COMPARISON 


A. Tools 


An open source tool is a program that performs a very specific
task and whose source code is openly distributed for use and/or
modification from its original design, free of charge. Such
tools are typically created as a collaborative effort in which
programmers improve the code and share the changes within the
community, and they are usually available at no charge under a
license defined by the Open Source Initiative. Proprietary
tools are tools or programs that vendors charge for. Such
software restricts some combination of use, modification,
copying and distribution of modified versions of the product.
It may also be called closed source software.


1) Internet Evidence Finder (IEF):
IEF is a software application that can search a hard
drive or files for Internet-related artifacts. It is a data
recovery tool geared towards digital forensics
examiners (JAD Software, 2011a). Its platform is Windows
and its licence is commercial.

2) CacheBack:
Its purpose is Internet cache and history analysis plus
analysis of social networking sites. Its platform is
Windows and its licence is commercial.

3) Wireshark:
Wireshark is a network protocol analyzer that allows
capturing and interactively browsing the traffic running
on a computer network; its packet capturing capability
can be used for digital forensics investigation. It runs
on Windows, Linux and Mac OS, and its licence is Open
Source (GPL).

4) TcpDump:
TcpDump is a command-line network packet analyzer. It
allows a digital forensic investigator to intercept TCP/IP
and other packet transfer information. It runs on Windows,
Linux and Mac OS, and its licence is Open Source (GPL).

5) File Viewers:
As cybercriminals carry out their acts, they also try to
save the collected data in different file formats to make
investigation difficult. File viewers are tools used to
view files that are in different formats and to save or
export them in other formats, such as bkf files, Lotus
Notes DXL, E01 files within EDB, PST, OST, MSG, PPT, Visio
diagrams and multimedia such as DVD, CD audio and VCD. The
platform is Windows and the licence is Open Source.


TABLE III
CF: COMPARISON OF OPEN SOURCE TOOLS AND PROPRIETARY TOOLS

Features | Open Source | Closed Source
Price Policy | Free of cost software. | Paid software.
Security | Code viewed, shared and modified by the community. | Software can be fixed only by a vendor.
Quality of Support | Option to contact support. | Options are forums, useful articles etc.
Source Code Availability | Can be changed. | Cannot be changed.
Usability | User guides are written for developers. | Documentation is well-written and contains detailed instructions.
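Packet capture tools such as TcpDump can save traffic in the classic pcap file format, which an investigator can then parse offline. The sketch below reads that format with the standard library only; the function name `read_pcap` is hypothetical, and it assumes the classic pcap layout with microsecond timestamps (not the newer pcapng format that Wireshark can also write).

```python
import struct

PCAP_MAGIC = 0xA1B2C3D4  # classic pcap, microsecond timestamps


def read_pcap(data):
    """Parse classic pcap bytes into (link_type, [(timestamp, payload), ...])."""
    if len(data) < 24:
        raise ValueError("too short to be a pcap file")
    # The magic number tells us the byte order used by the capturing host.
    if struct.unpack("<I", data[:4])[0] == PCAP_MAGIC:
        endian = "<"
    elif struct.unpack(">I", data[:4])[0] == PCAP_MAGIC:
        endian = ">"
    else:
        raise ValueError("not a classic pcap file")
    # Global header: magic, version major/minor, timezone, sigfigs,
    # snapshot length, link-layer type (24 bytes in total).
    fields = struct.unpack(endian + "IHHiIII", data[:24])
    link_type = fields[6]
    packets = []
    offset = 24
    while offset + 16 <= len(data):
        # Per-packet header: seconds, microseconds, captured and original length.
        ts_sec, ts_usec, incl_len, _orig_len = struct.unpack(
            endian + "IIII", data[offset:offset + 16]
        )
        offset += 16
        payload = data[offset:offset + incl_len]
        if len(payload) < incl_len:
            break  # truncated capture: stop at the last complete packet
        packets.append((ts_sec + ts_usec / 1_000_000, payload))
        offset += incl_len
    return link_type, packets
```

The returned payloads are raw link-layer frames; decoding Ethernet, IP and TCP headers would be a further step that this sketch leaves out.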


B. Comparison


We can compare open and closed source tools on the basis of
their features. In this paper we consider price policy,
security, quality of support, source code availability and
usability. Table III summarizes the comparison between open
source and closed source tools.

1) Price Policy: Open source tools are free of cost, although
there can be costs for extras such as assistance, additional
services or added functionality. Closed source tools are paid
software, and the cost can vary depending on the complexity of
the software.

2) Security: Security is one of the important features in the
comparison. The code of open source software can be viewed,
shared and modified by the community, which means anyone can
fix, upgrade and test broken code; however, the source code is
also open for hackers to practice on. Closed source software
can be fixed only by the vendor: if something goes wrong, you
send a request and wait for an answer from the support team.

3) Quality of Support: For open source tools, the costs include
an option to contact support and receive a response within one
business day in most cases, and the response is well organized
and documented. For closed source tools, the only support
options are forums, useful articles, and a hired expert, so a
high level of response cannot be expected from such services.

4) Source Code Availability: Open source tools allow the source
code to be changed without any restrictions, since the code is
readily available. Closed source tools are more restricted
because the source code cannot be changed or viewed. However,
this limitation may contribute to the security and reliability
of closed source software (CSS).

5) Usability: The last feature discussed in this paper is
usability, which is a painful subject for open source software.
User guides are written for developers rather than for lay
users, and these manuals often fail to conform to standards and
structure. For closed source software, usability is one of the
merits: documentation is usually well written and contains
detailed instructions.


IX. CONCLUSION 


The field of digital forensics has become popular over the
last few years as both the computer and the cellular markets
have expanded. With the increasing use of digital data and
mobile phones, cyber forensics has become more prominent, and
cyber thefts are also increasing day by day. This paper
presents a few existing and popular digital forensics tools.
The field enables crucial electronic evidence to be found,
whether it was lost, deleted, damaged, or hidden, and used to
prosecute individuals who believe they have successfully beaten
the system.

Digital evidence can also be obtained from data structures
located in memory by using different tools. The new process
model is chosen to collect crucial evidence quickly and
investigate cases immediately. A network packet analyzer is
used for network troubleshooting and analysis, for the
development of communication protocols, and in education. It
observes network traffic and identifies high levels of traffic
in the network. In forensic analysis, sophisticated forensic
tools are required not only to collect and analyze data, but
also to resolve any ambiguity or conflicts introduced by their
execution. This research provides a provisional study of the
tools for cyber forensic analysis of social media.

We described various definitions related to computer forensics,
the phases of cyber forensics and the forensics methodology.
Each phase was explored together with its respective tools. The
field is still evolving and will remain a hot topic as long as
there are ways to threaten data security. Finally, we showed
the current research trends in this new era of cyber forensics.

In this paper we have described various types of forensic
tools that can be used for solving digital crimes. In some
cases the tools are software-based, but at times hardware is
also required to acquire evidence. Some of the software is
freely available and some is paid. Freely available software
is also known as open source tools. A comprehensive list of
these tools, with their download links, uses and the platforms
(Windows, UNIX, Linux, DOS, Mac, etc.) on which they run, has
been given. The paid tools are also known as closed source or
proprietary tools. These tools have more features than open
source tools and are highly expensive.
Abstract—Big data is the term given to a dataset when the
quantity of data within the system exceeds that which can
be managed by a single processor. In such a context, it is
necessary to store and analyze the data using specialized big
data techniques. A number of big data methods are available,
but in this study the emphasis is placed on the widely used
MapReduce and YARN MapReduce approaches. The gathered data are
first stored in an HDFS storage system before analysis. With
HDFS, data are stored in the form of blocks, which can
subsequently be divided into various different block clusters.
This paper presents big data analytics in various domains, such
as public health care and smart agriculture.

Index Terms—Post performance, artificial neural network 
(ANN), K-nearest neighbours (KNN), decision tree, simple linear 
regression. 


I. INTRODUCTION 


BIG DATA is a field that treats ways to analyze,
systematically extract information from, or otherwise deal
with data sets that are too large or complex to be
dealt with by traditional data-processing application software.
Data with many entries (rows) offer greater statistical
power, while data with higher complexity (more attributes
or columns) may lead to a higher false discovery rate. Big
data analysis challenges include capturing data, data storage,
data analysis, search, sharing, transfer, visualization,
querying, updating, information privacy, and data source. Big
data was originally associated with three key concepts: volume,
variety, and velocity. The analysis of big data presents
challenges in sampling, and thus previously only allowed for
observations and sampling. Therefore, big data often includes
data with sizes that exceed the capacity of traditional
software to process within an acceptable time and value.
Current usage of the term big data tends to refer to the use of
predictive analytics, user behavior analytics, or certain other
advanced data analytics methods that extract value from big
data, and seldom to a particular size of data set. "There is
little doubt that the quantities of data now available are
indeed large, but that's not the most relevant characteristic
of this new data ecosystem." Analysis of data sets can find new
correlations to "spot business trends, prevent diseases, combat
crime and so on". Scientists, business executives, medical
practitioners, advertising and governments alike regularly meet
difficulties with large data sets in areas including Internet
searches, fintech, healthcare analytics, geographic information
systems, urban informatics, and business informatics.
Scientists encounter limitations in eScience work, including
meteorology, genomics, connectomics, complex physics
simulations, biology, and environmental research. This paper is
organized as follows: Section II presents the dimensions of Big
Data, Section III presents public health care, and Section IV
presents Big Data in agriculture.

Fig. 1. Big social data


II. DIMENSIONS OF BIG DATA 


Big Data is characterized by a set of words beginning with V,
referred to as its dimensions. Initially, three of these
(volume, velocity, variety) were known as the main ones.
More have since been added: first there were five, and later
ten, which is why today we speak of the 10 V's of Big Data or
the Dimensions of Big Data. Nowadays, Big Data can be defined
in several different ways. Big Data refers to a volume of data
in the order of exabytes and more. All the different
definitions of Big Data can be summarized in criteria which are
also known as the V's of Big Data. They are velocity, volume,
variety, veracity, value, venue, validity, variability,
vagueness and vocabulary.


• Volume:
You may have heard on more than one occasion that
Big Data is nothing more than business intelligence
in a very large format. More data, however, does not
necessarily mean Big Data. Big Data obviously needs a
certain amount of data, but having a huge amount of data
does not necessarily mean that you are working with
Big Data. It would also be a mistake to think that all
areas of Big Data are business intelligence. Big Data is
not limited or defined by the objectives sought with an
initiative, but by the characteristics of the data
itself.

• Variety:
Today, we can base our decisions on the prescriptive data
obtained through Big Data. Thanks to this technology,
every action of customers, competitors, suppliers, etc.
generates prescriptive information that ranges from
structured and easily managed data to unstructured
information that is difficult to use for decision making.
Each piece of data, or core information, requires specific
treatment. In addition, each type of data has specific
storage needs; storing an email takes much less space
than storing a video.

• Veracity:
This V refers to both data quality and availability.
In traditional business analytics, the sources of data
are much smaller in both quantity and variety. However,
the organization has more control over them, and their
veracity is greater. With Big Data, variety means greater
uncertainty about the quality of the data and its
availability. It also has implications for the data
sources we may have.

• Velocity:
Variety and veracity would quite possibly be less relevant,
and create less pressure when facing a Big Data initiative,
were it not for the high volume of information that has to
be handled and, above all, the velocity at which the
information has to be generated and managed. The data are
an input for the technology area, which must be able to
store and digest large amounts of information. The output
is the decisions and reactions that later involve the
corresponding departments. The important thing here is
that they are able to react with the necessary speed to
boost the business area.

• Variability:
Variability is different from variety. A coffee shop may
offer 6 different blends of coffee, but if you get the same
blend every day and it tastes different every day, that is
variability. The same is true of data: if the meaning is
constantly changing, it can have a huge impact on your
data homogenization.


III. PUBLIC HEALTH CARE 


To improve the quality of healthcare, Big Data Analytics is
required to provide quick services to patients, so that they
get relief faster and the overuse of drugs or medicine is
reduced.

Healthcare data can be analyzed with Big Data Analytics to
predict patient diseases and to suggest suitable drugs. With
real-time data, hazardous diseases and infections can be
predicted as early as possible. These predictions help to
contain hospital outbreaks and reduce patient morbidity. The
real-time data acquired from ICU monitors and other equipment
help to address life-threatening infections as fast as
possible. Such high-volume healthcare data can be analyzed
using real-time analytics to improve predictions.

Electronic Medical Records (EMR) and Electronic Health
Records (EHR) are structured in nature: data are stored in
fields such as patient ID and name of treatment undergone,
which can be handled with automated databases. Innovative
advances in the participatory internet make social media
platforms such as Facebook an inescapable platform for health
care promotion and education. Big data analysis can be used to
predict the performance of a Facebook post.

Several methods and performance indicators are compared for
predicting post performance: Artificial Neural Network (ANN),
Deep Neural Network (DNN) and K-Nearest Neighbours (KNN).
Within the public health paradigm, the field of health
informatics deals with "the structures and processes, as well
as the outcomes involved in the use of information and
communication technologies (ICTs) within health". Situated
within the new public health informatics field, the study
compares different methods to study attributes of healthcare
posts. Furthermore, the suitability of an algorithm depends
solely on the data characteristics; therefore, further in-depth
analysis is needed to find suitable unsupervised and supervised
machine learning algorithms to derive meaningful facts and
actionable insights from social data.

Individuals and organizations upload 350 million photos
to Facebook per day and generate 4 million likes every minute.
Thus, the majority of posts on social media go unnoticed by
target users or, even worse, inaccurate or misleading
information can go viral.


IV. BIG DATA IN AGRICULTURE 


The use of big data for agricultural analysis can help to 
provide a better understanding of agriculture for farmers and 
also government agencies. 

While this kind of approach has been widespread in many
industrial sectors, Thai agriculture has not yet seen its
implementation on a wide scale. This may partly explain the
economic successes in industry while agriculture has lagged
behind, especially in terms of worker remuneration. One
further problem is that a majority of farmers do not have the
education required to take advantage of technology and data
analysis.

This study therefore seeks to establish a framework to
support the use of data analytics in the agricultural context,
through the development of a web-based application capable
of displaying performance data in farming and thus addressing
the key issues in the agricultural sector to support farmers.
The framework will apply a number of software solutions
to support agricultural production across various disciplines.
The information provided will assist farmers in managing their
operations, and will guide government departments in creating
policies and plans for Thai agriculture in order to develop
modern and efficient farming.

The main purpose of Big Data tools is to increase production
in order to offer higher quantities while ensuring higher
quality products. However, there remain some issues that need
to be addressed.


V. DATA SET DESCRIPTION 
A. Predictive Big Data Analytics in Healthcare 


As outlined in Section III, Big Data Analytics can improve
the quality of healthcare by analyzing real-time and structured
data (for example from ICU monitors, EMR and EHR) to predict
diseases and suggest suitable drugs. The key challenges of Big
Data Analytics in the healthcare domain include capturing,
storing, searching, sharing and analyzing healthcare data.

Organizing the data after extracting it from different layers
and integrating it is also a challenging task. The quality of
information at each level should be checked by applying
security and protection methods. Big Data transforms the
healthcare sector by improving outcomes through healthcare
analytics. Doctors can make quick decisions based on the
results obtained by applying Big Data Analytics.

The healthcare business is also profiting from the advent of
analytics. Developments in healthcare standards can help to
identify and predict diseases at an early stage so that they
can be cured in minimum time. Since every record is unique, the
corresponding dataset is managed by applying analytics.
Inefficiency in healthcare data is eliminated by approaches
such as clinical operations, evidence-based medicine, web-based
pre-adjudication for fraud analysis, and remote patient
monitoring.


B. Public Health Care Analysis using Facebook 


Facebook is a social media platform that can bring a lot of
information to people. Posts fall into two types: those that
users engage with and those that users are not interested in.
Differentiating between them helps public health and care
organizations understand how to convey information to people,
so that a post on Facebook reaches a large audience. The
delivery of information through traditional media such as TV,
radio and newspapers is limited, whereas posting on Facebook is
simple.

To find posts that are relevant to the user, each post is
described by 11 characteristics: Post Type, Hour Span, Facebook
Wall Category, Level, Country, isHoliday, Season, Created Year,
Month, Day of the Week and Time of the Day. The methods used
for this are Artificial Neural Networks (ANN), Deep Neural
Networks (DNN) and network topologies. The aim is to establish
a post engagement framework and find the right classification
model to support public health care organizations in their
social media strategy, specifically what type of content to
post and when.


VI. ALGORITHMS 


The study uses two clustering algorithms and two classification
algorithms. The clustering algorithms are the Gaussian Mixture
Model (GMM) and K-means; the classification algorithms are
K-Nearest Neighbours (KNN) and Artificial Neural Network (ANN).
The classification algorithms achieve different prediction
accuracies. The ANN is applied to a big social data set from
Facebook to forecast certain attributes and their likelihood of
belonging to one or another of the popularity clusters.
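To illustrate how a clustering step can form popularity clusters before classification, here is a minimal K-means sketch in plain Python. It is an assumption for illustration, not the implementation used in the study: the function name is hypothetical, the first k points are taken as initial centroids (a naive choice that can converge to a poor solution), and real work would normally rely on an established library.

```python
def kmeans(points, k, iterations=20):
    """Cluster equal-length numeric tuples; return (centroids, assignment)."""
    centroids = [tuple(p) for p in points[:k]]  # assumed naive initialisation
    assignment = [0] * len(points)

    def squared_distance(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))

    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        for i, point in enumerate(points):
            assignment[i] = min(
                range(k), key=lambda c: squared_distance(point, centroids[c])
            )
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = tuple(
                    sum(column) / len(members) for column in zip(*members)
                )
    return centroids, assignment
```

The cluster index assigned to each post could then serve as the class label (for example low, medium or high engagement) that a classifier such as KNN or an ANN is trained to predict.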


A. Artificial Neural Networks(ANN) 


Artificial Neural Networks (ANN) have previously been used in
text data and sentiment classification. In DAN2 (a Dynamic
Architecture for Artificial Neural Networks), the feed-forward
approach is used, with input, hidden and output layers.
Hidden layers are generated dynamically until the desired level
is reached. The data set analysed in this section has a
combination of quantitative and qualitative attributes. Some of
the attributes were part of the data from the beginning, some
were derived to achieve better insight into features, and some
were disregarded at this point, as they were not relevant for
the analyses or had already served their purpose for other
derived attributes.

Artificial Neural Networks (ANN) will be applied to estimate
which attributes contribute the most to the prediction
results. Additionally, the classification matrix will show
whether the performance of the model is better than simply
predicting all outputs to be the largest class in the training
data set. “The goal of ANN learning algorithm is to determine
a set of weights that minimize the sum of squared errors”,
according to Pang et al.


$$E(w) = \frac{1}{2} \sum_{i=1}^{N} (y_i - \hat{y}_i)^2 \qquad (1)$$
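
As a minimal sketch of this objective, the following Python snippet evaluates the sum of squared errors for a single linear unit, assuming the conventional factor of 1/2. The function names (`predict`, `sum_squared_error`), the weights and the toy data are illustrative assumptions and are not taken from the study.

```python
def predict(weights, bias, x):
    """Output of a single linear unit (illustrative stand-in for a network)."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias


def sum_squared_error(weights, bias, inputs, targets):
    """E(w) = 1/2 * sum over i of (y_i - y_hat_i)^2."""
    return 0.5 * sum(
        (y - predict(weights, bias, x)) ** 2 for x, y in zip(inputs, targets)
    )


# Hypothetical toy data: the targets follow y = 2*x1 + 1*x2 exactly.
inputs = [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
targets = [1.0, 2.0, 3.0]

error_good = sum_squared_error([2.0, 1.0], 0.0, inputs, targets)  # matching weights
error_bad = sum_squared_error([0.0, 0.0], 0.0, inputs, targets)   # all-zero weights
```

A learning algorithm would search for the weights that make this value as small as possible; here the matching weights give a lower error than the all-zero weights.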


B. Deep Learning 


Deep learning here refers to a neural network with 3-4 layers.
Several architectures are associated with the DNN, such as
convolutional neural networks, deep belief networks and
Boltzmann machines. DNNs commonly use a GPU, which improves the
run time, and they try not to over-fit the data further.
However, the GPU is not used in this paper. There are numerous
studies mentioned in the Related Work section that argue that a
deep neural network (DNN) shows improved performance in
comparison to an ANN, and therefore the 3rd hypothesis was
tested with the current data set. The algorithm was run 14
times, adding a few additional layers each time. All other
settings, such as 10 hidden units, 10 network and file size,
were kept unchanged.


Fig. 2. Deep Neural Network with 5 hidden layers


This figure shows a deep neural network that achieved 
accuracy of 67% with 5 hidden units in each of the 5 hidden 
layers, three output nodes that correspond to engagement 
clusters and 8 input attributes. Shallow networks perform on 
the same level or better than big deep neural nets when applied 
to public health-care data from Facebook presented here. 


C. K-Nearest Neighbour 


In K-Nearest Neighbours, post popularity can be predicted
using three engagement clusters and eight independent
attributes. The query points in the KNN algorithm will be
classified as class 0, 1 or 2. Fig. 3 shows the classification
process with K = 10 and K = 3.

If K is too small, the nearest neighbour classifier can be
susceptible to over-fitting, which can lead to a higher error
rate on new data. If K is too large, the nearest neighbour
algorithm may misclassify the test instance, because the
neighbourhood then includes points that lie far from it. To
make predictions with KNN, one needs to decide on a metric
for measuring the distance between the query point and the
points assigned to classes. The KNN algorithm showed better
accuracy than an ANN with 2 hidden layers and 2 hidden units,
with a training sample size that may be 5%.
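
The classification step described above can be sketched in a few lines of Python. This is a minimal illustration that assumes the Euclidean distance as the metric and a simple majority vote; the function names and the two-dimensional toy points are hypothetical and do not come from the Facebook data set.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    neighbours = sorted(
        zip(train_points, train_labels),
        key=lambda pair: euclidean(pair[0], query),
    )[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: three well-separated engagement clusters (0, 1, 2).
train_points = [
    (1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
    (5.0, 5.0), (5.2, 4.9), (4.8, 5.1),
    (9.0, 1.0), (9.1, 0.9), (8.9, 1.2),
]
train_labels = [0, 0, 0, 1, 1, 1, 2, 2, 2]

label_a = knn_predict(train_points, train_labels, (1.0, 0.9), k=3)
label_b = knn_predict(train_points, train_labels, (5.1, 5.0), k=3)
label_c = knn_predict(train_points, train_labels, (9.0, 1.1), k=3)
```

Changing `k` in this sketch shows the trade-off discussed above: with `k` equal to the size of the whole training set, every query receives the label of the overall majority class.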
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Fig. 3. K Nearest Neighbours 


VII. BIG DATA ANALYTICS IN SMART AGRICULTURE 


The major problem we are facing nowadays is food shortage. The
growing population is the main factor behind the fear of food
shortages. Food producers face many challenges, such as
pollution or contamination of land, water and other resources,
climate change, etc. The increase in food production must be
accomplished while reducing the environmental impact, which at
present is claimed to amount to around 20% of current man-made
Greenhouse Gas (GHG) emissions. Various methods are used to
address these problems, especially crop monitoring and yield
management through the use of satellite data. Modern
agriculture is conducted with the support of biotechnology and
the latest technologies, such as the Internet of Things, cloud
computing and remote sensing, to create the idea of smart
farming.

The use of big data in the agricultural sector would require
significant investments in the infrastructure needed for data
handling, especially where real-time processing is needed,
as would be the case in weather and epidemic forecasting
and also in monitoring the effects of pests. Big data would,
however, permit both authorities and farmers to obtain valuable
economic guidance from large quantities of data under analysis.
In other industries, the analysis of big data has already
proven highly effective.

The financial sector has been one beneficiary, while the 
online behaviour of customers is now better understood, and 
the techniques have been useful in environmental studies. 
At the governmental level, the use of big data has been 
highly effective in managing services provided for citizens in 
addressing the matters of health care, the economy, disaster 
relief, and job creation, among others. 

The structure of the study involves an examination of the
literature covering a number of different farming activities,
which include strategies and marketing, management and
information systems, and the use of new technologies as well as
big data. These ideas are then combined to create a systematic
system of classification and prediction which is based on
the use of Big Data. The aim is to create four strategies
on the basis of individual interactions and the Enterprise
Strategy Map. The most effective form of target marketing


Ann Annice Paul et al., “Big Social Data Analytics on Various Socio-Economic Factors” 42 


Proceedings of Vidya MCA Departmental Seminar (VMCADS - 2021), 22 - 23 November 2021 


Vidya Academy of Science & Technology, Thrissur — 680501 


involves the One-to-one learning relationship which is cost 
effective and allows individual offers to be personalized and 
designed for each buyer. The idea of the learning relationship 
refers to the gathering of data from the interactions recorded 
between customers and farmers. Therefore, there are more 
marketers today who are considering personalization strategies 
to enhance the marketing activities of farmers since this can 
make use of the advantages offered by one-to-one marketing 
and the cultivation of the relationship between buyers and 
sellers. 

The aim of the model is to provide predictions for a target
variable by learning simple decision rules from the features
of the available data. The aim of this work is to present Big
Data technologies and tools for data storage and analysis.
Thus, our goal is to understand how they work, the benefits
they offer, their complexity and the need to use them
together. We will also present Smart Farming and how Big Data
is revolutionizing this sector in many different ways. To do
that, a bibliographic study was conducted in three steps:
(1) collecting related works, (2) identifying the relevant
works and finally (3) summarising and analysing the directly
relevant works. This research is mainly related to agriculture
and livestock, and its goal is to ensure healthy and
sustainable food. Those studies have been published under
several names: Smart Farming (SF), Sustainable Farming
(Agriculture), Precision Agriculture (PA), etc. In this study,
we will refer to these concepts as Smart Farming. Smart
Farming is a global initiative that aims to preserve resources
and to ensure both sustainable and healthy agriculture and
livestock with the help of advanced technologies. Smart
farming encompasses many aspects other than growing crops,
such as animal husbandry and other areas (beekeeping, etc.).

Five specific fields are commonly covered by Sustainable
Agriculture (SA), Smart Sustainable Agriculture (SSA) and
Precision Agriculture (PA): soil, crops, machines, irrigation
and water, and pests and fertilization. For livestock farming,
or Precision Livestock Farming, we can likewise consider five
categories: animal behavior, genetic testing (which enables
the control of the genomics of livestock reproduction and
affects the extent to which researchers can have confidence in
the information regarding the agricultural systems used for
production), animal welfare, nutrition management and species
protection. PA and livestock farming (LA) share common
categories such as climate change, resilience, productivity,
humans and sustainability.


A. Smart Farming (SF) 


Smart Farming is a development that emphasizes the use 
of information and communication technology in the cyber- 
physical farm management cycle. New technologies such as 
the Internet of Things and Cloud Computing are expected 
to leverage this development and introduce more robots and 
artificial intelligence in farming. This is encompassed by the 
phenomenon of Big Data, massive volumes of data with a wide
variety that can be captured, analysed and used for
decision-making. This review aims to gain insight into the
state-of-the-art of Big Data applications in Smart Farming and
identify the related socio-economic challenges to be addressed.

Following a structured approach, a conceptual framework 
for analysis was developed that can also be used for future 
studies on this topic. The review shows that the scope of 
Big Data applications in Smart Farming goes beyond primary 
production; it is influencing the entire food supply chain. Big 
data are being used to provide predictive insights in farming 
operations, drive real-time operational decisions, and redesign 
business processes for game-changing business models. Sev- 
eral authors therefore suggest that Big Data will cause major 
shifts in roles and power relations among different players in 
current food supply chain networks. 

The landscape of stakeholders exhibits an interesting game 
between powerful tech companies, venture capitalists and 
often small start-ups and new entrants. At the same time there 
are several public institutions that publish open data, under the 
condition that the privacy of persons must be guaranteed. 

The future of Smart Farming may unravel in a continuum
between two extreme scenarios: closed, proprietary systems in
which the farmer is part of a highly integrated food supply
chain, or open, collaborative systems in which the farmer and
every other stakeholder in the chain network is flexible in
choosing business partners, both for the technology and for
the food production side. The further development of data and
application infrastructures (platforms and standards) and their
institutional embedment will play a crucial role in the battle
between these scenarios. From a socio-economic perspective,
the authors propose to give research priority to organizational
issues concerning governance and suitable business models for
data sharing in different supply chain scenarios.


B. Sustainable Farming (Agriculture) 


Sustainable agriculture is farming in sustainable ways meet- 
ing society’s present food and textile needs, without compro- 
mising the ability for current or future generations to meet 
their needs. It can be based on an understanding of ecosystem 
services. There are many methods to increase the sustainability 
of agriculture. When developing agriculture within sustainable 
food systems, it is important to develop flexible business 
processes and farming practices. Agriculture has an enormous 
environmental footprint, playing a significant role in causing 
climate change, water scarcity, water pollution, land degrada- 
tion, deforestation and other processes; it is simultaneously 
causing environmental changes and being impacted by these 
changes.[4] 

Sustainable agriculture consists of environment-friendly
methods of farming that allow the production of crops or
livestock without damage to human or natural systems. It
involves preventing adverse effects on soil, water,
biodiversity and surrounding or downstream resources, as well
as on those working or living on the farm or in neighbouring
areas. Elements of sustainable agriculture can include
permaculture, agroforestry, mixed farming, multiple cropping
and crop rotation.


C. Precision Agriculture (PA)


In modern agriculture, historical and real-time data is
gathered in unstructured and structured data sets. The
unstructured data collected in precision agriculture is
analysed to find any useful information in it. The priority of
modern agriculture is to increase efficiency and productivity
while reducing the initial farm input costs. Big data supports
different forms of precision agriculture functions by
discovering intelligence and insights from the collected data
in order to solve farming problems and inform farming
decisions. ICT continues to play this vital role by improving
productivity, eradicating hunger and avoiding unwanted
agricultural investments.


Fig. 4. Smart Farming (figure labels: Soils/Biodiversity, Pests/Fertilization, Irrigation/Water)


D. Algorithms 


The algorithms employed in the course of this research are 
described as follows in greater detail: 

1) Simple Linear Regression: Simple linear regression is 
used to find out the best relationship between a single input 
variable (predictor, independent variable, input feature, input 
parameter) and output variable (predicted, dependent variable, 
output feature, output parameter) provided that both variables 
are continuous in nature. This relationship represents how an 
input variable is related to the output variable and how it is 
represented by a straight line. 

To understand this concept, let us have a look at scatter
plots. Scatter diagrams or plots provide a graphical
representation of the relationship between two continuous
variables. This modelling approach allows one variable to be
forecast in terms of another variable. The variable which is
predicted is the criterion (dependent) variable, denoted Y.
The predictor variable, on which the forecast is based, is
denoted X. When there is only one predictor variable, the
prediction technique is known as simple regression. In simple
linear regression, X is used to forecast Y, and the
predictions, when plotted against X, form a straight line.
From a scatter plot we can understand:


• The direction
• The strength
• The linearity


These characteristics describe the relationship between
variable Y and variable X. The scatter plot in Fig. 5 shows
that variable Y and variable X possess a strong positive
linear relationship. Hence, we can fit a straight line which
describes the data as accurately as possible. If the
relationship between variable X and variable Y is strong and
linear, then we conclude that the independent variable X is an
effective input variable to predict the dependent variable Y.
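
A straight line of this kind can be fitted by ordinary least squares. The sketch below is one way to do this in plain Python; the function name `fit_line` and the toy data are illustrative assumptions, and least squares is assumed as the fitting criterion since the text does not name one.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for the model y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical toy data lying exactly on the line y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.0, 5.0, 7.0, 9.0, 11.0]

slope, intercept = fit_line(xs, ys)
predicted = slope * 6.0 + intercept  # forecast Y for a new value X = 6
```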

To check the correlation between variable X and variable Y, we
use the correlation coefficient (r), which gives a numerical
value for the correlation between two variables. The
correlation between two variables can be strong, moderate or
weak. The higher the magnitude of “r”, the higher the
preference given to the particular input variable X for
predicting the output variable Y. A few properties of “r” are
listed as follows:


Fig. 5. Simple Linear Regression (scatter plot; horizontal axis: x)


• Range of r: -1 to +1
• Perfect positive relationship: +1
• Perfect negative relationship: -1
• No linear relationship: 0
• Strong correlation: r > 0.85 (depends on business scenario)


If r < 0.85, then apply a transformation to the data to
increase the value of “r” and build a simple linear regression
model on the transformed data.
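
The correlation coefficient can be computed directly from its definition. The following sketch assumes the Pearson correlation coefficient and judges strength on the magnitude of r, so that a strong negative relationship also counts as strong; the helper names and the sample values are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def is_strong(r, threshold=0.85):
    """Threshold taken from the text; the use of abs() is an assumption."""
    return abs(r) > threshold


xs = [1.0, 2.0, 3.0, 4.0, 5.0]
r_perfect = pearson_r(xs, [3.0, 5.0, 7.0, 9.0, 11.0])   # points on a rising line
r_negative = pearson_r(xs, [11.0, 9.0, 7.0, 5.0, 3.0])  # points on a falling line
r_noisy = pearson_r(xs, [2.0, 1.0, 4.0, 3.0, 5.0])      # rising trend with scatter
```

In this toy example the noisy series falls below the 0.85 threshold, which is the situation in which the text suggests transforming the data before fitting.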

2) Decision tree: A decision tree is a supervised,
non-parametric learning algorithm which can be used for both
classification and regression problems.

A decision tree or a classification tree is a tree in which
each internal (non-leaf) node is labeled with an input feature.
The arcs coming from a node labeled with a feature are labeled
with each of the possible values of the feature. Each leaf of
the tree is labeled with a class or a probability distribution
over the classes. A tree can be “learned” by splitting the
source set into subsets based on an attribute value test. This
process is repeated on each derived subset in a recursive
manner called recursive partitioning. The recursion is
completed when all the samples in the subset at a node have
the same value of the target variable, or when splitting no
longer adds value to the predictions. This process of top-down
induction of decision trees is an example of a greedy
algorithm, and it is the most common strategy for learning
decision trees.

Decision trees used in data mining are of two types:


• Classification tree, when the response is a nominal
variable, for example whether an email is spam or not.

• Regression tree, when the predicted outcome can be
considered a real number.
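
The recursive partitioning described above can be sketched for a small classification tree as follows. This is a simplified illustration and not the implementation used in the study: it assumes categorical features, uses the Gini impurity as the splitting criterion (the text does not name one), and stops only when a node is pure or no features remain. The feature names and the toy spam data are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    total = len(labels)
    return 1.0 - sum((count / total) ** 2 for count in Counter(labels).values())


def best_feature(rows, labels, features):
    """Greedy choice: the feature whose split gives the lowest weighted impurity."""
    def weighted_impurity(feature):
        groups = {}
        for row, label in zip(rows, labels):
            groups.setdefault(row[feature], []).append(label)
        return sum(len(g) / len(labels) * gini(g) for g in groups.values())
    return min(features, key=weighted_impurity)


def build_tree(rows, labels, features):
    # Stop when the node is pure or no features remain; a leaf stores a class.
    if len(set(labels)) == 1 or not features:
        return Counter(labels).most_common(1)[0][0]
    feature = best_feature(rows, labels, features)
    remaining = [f for f in features if f != feature]
    node = {"feature": feature, "children": {}}
    for value in set(row[feature] for row in rows):
        subset = [(r, l) for r, l in zip(rows, labels) if r[feature] == value]
        node["children"][value] = build_tree(
            [r for r, _ in subset], [l for _, l in subset], remaining
        )
    return node


def classify(tree, row):
    """Follow the arcs matching the row's feature values until a leaf is reached."""
    while isinstance(tree, dict):
        tree = tree["children"][row[tree["feature"]]]
    return tree


rows = [
    {"has_link": "yes", "known_sender": "no"},
    {"has_link": "yes", "known_sender": "yes"},
    {"has_link": "no", "known_sender": "no"},
    {"has_link": "no", "known_sender": "yes"},
    {"has_link": "yes", "known_sender": "no"},
    {"has_link": "no", "known_sender": "no"},
]
labels = ["spam", "ham", "ham", "ham", "spam", "ham"]

tree = build_tree(rows, labels, ["has_link", "known_sender"])
result_spam = classify(tree, {"has_link": "yes", "known_sender": "no"})
result_ham = classify(tree, {"has_link": "no", "known_sender": "no"})
```

Because the feature is chosen locally at every node and never revisited, the sketch also shows why this induction procedure is called greedy.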


Decision trees are a simple method, and as such have some
problems. One of these issues is the high variance of the
resulting models. In order to alleviate this problem, ensemble
methods of decision trees were developed. Two groups of
ensemble methods are currently used extensively:


• Bagging decision trees:
Bagging builds multiple decision trees by repeatedly
resampling the training data with replacement, and lets the
trees vote for a consensus prediction. Random forest is a
well-known algorithm of this kind.

• Boosting decision trees:
Gradient boosting combines weak learners, in this case
decision trees, into a single strong learner in an iterative
fashion. It fits a weak tree to the data and iteratively keeps
fitting weak learners in order to correct the error of the
previous model.
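
The bagging idea can be illustrated with a short Python sketch. To keep it small, the base learner here is a one-level tree (a decision stump) on a single numeric feature rather than a full decision tree, and no random feature selection is performed, so this is plain bagging and not a random forest. All names, the seed and the toy data are assumptions made for the illustration.

```python
import random
from collections import Counter


def train_stump(xs, ys):
    """Pick the threshold and leaf labels with the fewest training errors."""
    best = None
    for t in sorted(set(xs)):
        for left, right in ((0, 1), (1, 0)):
            errors = sum((left if x < t else right) != y for x, y in zip(xs, ys))
            if best is None or errors < best[0]:
                best = (errors, t, left, right)
    _, t, left, right = best
    return lambda x: left if x < t else right


def bagging_fit(xs, ys, n_trees=25, seed=0):
    """Train one stump per bootstrap sample (sampling with replacement)."""
    rng = random.Random(seed)
    n = len(xs)
    ensemble = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        ensemble.append(train_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return ensemble


def bagging_predict(ensemble, x):
    """Consensus prediction by majority vote over all trained stumps."""
    votes = Counter(model(x) for model in ensemble)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: class 0 for x < 10, class 1 otherwise.
xs = list(range(20))
ys = [0 if x < 10 else 1 for x in xs]

ensemble = bagging_fit(xs, ys)
low = bagging_predict(ensemble, 0)
high = bagging_predict(ensemble, 19)
```

Each stump sees a slightly different sample and therefore places its threshold slightly differently; the vote averages out this variation, which is the variance reduction that motivates bagging.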


VIII. CONCLUSION 


Big Data Analytics is a type of advanced analytics that
entails sophisticated applications that use analytics systems
to power features like predictive models, statistical
algorithms and what-if analysis. Big data analytics is the
“sometimes complicated” process of analysing large amounts of
data in order to find information like hidden patterns,
correlations, market trends and consumer preferences that may
aid businesses in making better decisions.

The role of Big Data in today’s healthcare sectors is
discussed with the help of Hadoop; the goal of effective
healthcare management can be achieved by providing effective
data-driven services to people by predicting their needs. Big
Data transforms the healthcare sector by improving outcomes
through the application of healthcare analytics. Many methods
are applied to discover features for each healthcare Facebook
post. Over-fitting is one of the reasons for the decline in
accuracy when additional hidden units are added to an
Artificial Neural Network (ANN). In the case of K Nearest
Neighbours (KNN), avoiding over-fitting is likewise an
important factor. The current research shows that better
prediction results are obtained through K Nearest Neighbours
(KNN).

The use of Big Data for agricultural analysis can help
to provide a better understanding of agriculture for farmers
and also government agencies. This study contributes to the
current knowledge base by establishing a conceptual framework
for the use of data analysis to support the agricultural
sector. As we saw in this paper, Big Data provides efficient
tools to acquire, store, analyze and enhance data from
farming. Acquisition tools are used to collect high-volume
data coming from various sources (variety) and transmit them
to be analyzed. Big Data technologies offer a huge potential
to boost research and development around farming and to
address the big challenge of producing food in higher quantity
while ensuring higher quality, in a sustainable way, by
preserving natural resources and protecting ecosystems.
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Abstract—Image Enhancement is the process of improving 
the quality and information content of original data before
processing. Enhancing an image means removing any kind of
noise or distortion, improving the visibility of the image in
a particular area, or suppressing the information in an area.
There are different techniques used for image enhancement,
such as median filtering, histogram equalization and linear
contrast adjustment. Similarly, there are different types of
images, and different techniques are used for each of them. In
this paper, image enhancement techniques are discussed for
three types of images: satellite images, underwater images
and medical images.


Index Terms—Satellite images, satellite image processing, med- 
ical images - MRI, CT-scan, X-ray, image enhancement, contrast 
stretching. 


I. INTRODUCTION 


Satellite images are images of Earth collected by imaging
satellites operated by governments and businesses around the
world. Satellite images are one of the most powerful and
important tools used by the meteorologist. They are
essentially the eyes in the sky. These images reassure
forecasters about the behavior of the atmosphere, as they give
a clear, concise and accurate representation of how events are
unfolding.

Medical imaging is the technique and process of imaging the 
interior of a body for clinical analysis and medical interven- 
tion, as well as visual representation of the function of some 
organs or tissues. Medical imaging seeks to reveal internal 
structures hidden by the skin and bones, as well as to diagnose 
and treat disease. Medical imaging also establishes a database 
of normal anatomy and physiology to make it possible to 
identify abnormalities. 

Underwater images are essentially characterized by their
poor visibility, because light is exponentially attenuated as
it travels through the water, and the scenes appear poorly
contrasted and hazy. Light attenuation limits the visibility
distance to about twenty meters in clear water and five meters
or less in turbid water.


II. SATELLITE IMAGE ENHANCEMENT


Satellite image enhancement is the technique most widely
required in the field of satellite image processing to
improve the visualization of features. Satellite images
are captured from a very long distance, so they contain a
great deal of noise and distortion caused by atmospheric
barriers. After the image is captured, some radiometric and
geometric corrections are carried out on it, but they are not
sufficient for all applications. Satellite image enhancements
are used to make visual interpretation and understanding of
imagery easier.


III. SATELLITE IMAGE ENHANCEMENT TECHNIQUES


A. Intensity, Hue and Saturation Transformation 


The system of primary colours (red, green, blue, or RGB) is
well established. An alternative approach to colour is the
intensity, hue and saturation (IHS) system. The intensity (I)
represents brightness variations and ranges from black (0) to
white (255). Hue (H) represents the dominant wavelength of
colour. Saturation (S) represents the purity of colour and
ranges from 0 to 255. A saturation of 0 represents a
completely impure colour, whereas high values represent pure
and intense colours. When any 3 spectral bands of sensor data
are combined in the RGB system, the resulting colour images
lack saturation. To overcome this problem, the data are
transformed from the RGB system to the IHS system,
equalization is performed on the saturation component, and the
data are then transformed back to the RGB system for
visualization.


Fig. 1. Result of Intensity equalization on satellite image


B. Density Slicing 


Density slicing converts the continuous grey tone of an
image into a series of density intervals, or slices, each
corresponding to a specified digital range D. Slices may be
displayed as areas bounded by contour lines. This technique
emphasizes subtle grey-scale differences that may be
imperceptible to the viewer.
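
As a minimal sketch of density slicing, the following Python function assigns every grey value to the index of the interval it falls in. The function name, the slice boundaries and the tiny image are illustrative assumptions; in practice each slice index would be displayed in its own colour or tone.

```python
import bisect


def density_slice(image, boundaries):
    """Replace each grey value by the index of the density interval containing it.

    `boundaries` is a sorted list of lower bounds for slices 1, 2, ...;
    values below the first boundary fall into slice 0.
    """
    return [[bisect.bisect_right(boundaries, p) for p in row] for row in image]


# Hypothetical 8-bit image and three boundaries, giving four slices:
# 0-63, 64-127, 128-191 and 192-255.
image = [
    [0, 63, 64],
    [127, 128, 255],
]
sliced = density_slice(image, [64, 128, 192])
```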


C. Contrast Enhancement 


Contrast Enhancement is frequently referred to as one of the 
most important issues in image processing. The problem is to 
optimize the contrast of an image in order to represent all the 
information in the input image. There are different techniques 
to overcome this issue, some are: 


• Linear Contrast Stretching.

• Local histogram equalization (LHE).

• General histogram equalization (GHE).

• Decorrelation Stretching.


IV. MEDICAL IMAGE ENHANCEMENT 


Technology today is extremely advanced, and physicians can
now call upon a variety of imaging techniques, such as scans
and images of the body, to help examine the inside of the body
and therefore make an accurate diagnosis. Medical imaging is
one of the most advanced fields of imaging and is required for
creating a visual representation of the internal structure of
the human body. Medical image analysis is a particularly
difficult problem due to the inherent characteristics of these
images: low contrast, speckle noise, signal dropouts and
complex anatomical structures. Therefore, it is very important
to enhance the contrast of such images before further
processing and analysis.


V. MEDICAL IMAGING MODALITIES 


Several medical imaging modalities have been used for
analyzing anatomical structures such as bones, muscles, blood
vessels and tissue types, and pathological regions such as
cancer and multiple sclerosis lesions. Here, three medical
imaging modalities will be discussed: MRI, CT scan and X-ray.
Each one has its own mechanism of providing relevant
physiological information about the organ being imaged.


A. Magnetic Resonance Imaging (MRI) 


MRI is a special radiology technique designed to image
internal structures of the body using magnetism, radio waves
and a computer. The MRI scanner uses magnetic fields and radio
waves to create pictures of tissues, organs and other
structures within the body. The images produced by an MRI scan
display fine detail well. Therefore, it is possible to picture
all the tissue in the body by using an MRI scanner.


Fig. 2. Brain MRI 


B. Computed Tomography (CT) 


Computed Tomography (CT) is a technique which utilizes
X-rays in conjunction with computing algorithms to image
tissues in the body. CT is one of the important diagnostic
tools in the field of medical imaging and is used to image the
internal structure of the human body. It provides good
contrast among the different soft tissues of the body, which
makes it especially useful for imaging the muscles, brain and
cancers compared with other medical imaging techniques. Some
advances in CT machine technology have been introduced to
increase the contrast of the CT images used for diagnostic
purposes.


Fig. 3. A Chest CT


C. Radiologic Technology (X-ray)


X-ray images are used to image the internal structure of the
human body. In the field of medicine, X-ray imaging is one of
the most widely used diagnostic tools. X-rays are used for
capturing images of the internal body structure that help
radiologists in recognizing internal problems. It is the most
useful imaging modality for checking for bone fractures.
Despite the several advantages of X-ray technology, it
generates low contrast images. The main reason for the low
contrast of X-ray images is the amount of liquid present in
the human body. One can increase the power of the X-rays when
capturing images, but this may harm the human body and bones.


Fig. 4. X-ray 


VI. MEDICAL IMAGE ENHANCEMENT TECHNIQUES 
A. Histogram Equalization 


Histogram equalization (HE) is a very popular technique that
is widely used for enhancing the contrast of an image. Its
basic idea lies in mapping the gray levels based on the
probability distribution of the input gray levels. HE improves
contrast by obtaining a uniform histogram and can be used on
a whole image or just on a part of an image. This technique
attempts to spread out the gray levels in an image and
reassigns the brightness values of pixels based on the image
histogram. Histogram equalization is effective only when the
original image has low contrast to start with; otherwise
histogram equalization may degrade the image quality.
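
A common way to realise this mapping is through the cumulative distribution of the gray levels. The sketch below assumes an 8-bit grayscale image stored as nested lists and uses the usual cumulative-histogram formula; the function name and the tiny test image are illustrative assumptions rather than part of the surveyed methods.

```python
def equalize_histogram(image, levels=256):
    """Spread the gray levels of an image using its cumulative histogram."""
    pixels = [p for row in image for p in row]
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    # Cumulative distribution of the gray levels.
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)

    cdf_min = min(c for c in cdf if c > 0)
    total = len(pixels)
    if total == cdf_min:
        # Constant image: there is nothing to spread out.
        return [row[:] for row in image]

    mapping = [
        max(0, round((c - cdf_min) * (levels - 1) / (total - cdf_min)))
        for c in cdf
    ]
    return [[mapping[p] for p in row] for row in image]


# Hypothetical low-contrast image: all values lie between 10 and 40.
image = [
    [10, 20],
    [30, 40],
]
equalized = equalize_histogram(image)
```

After equalization the four gray levels of the toy image are spread over the whole 0 to 255 range.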


B. Negative Image 


The basic process of this technique is to calculate the
negative of an image. To calculate the negative, the pixel
intensity values are inverted. For example, if an image has
dimensions R x C, where R is the number of rows and C is the
number of columns, it is denoted by I(r, c). The negative
N(r, c) of the image I(r, c) can be computed as in equation
[4]:


N(r, c) = 255 - I(r, c)


where 0 ≤ r < R and 0 ≤ c < C. Here, each pixel value of the
original input image is subtracted from 255 to produce the
negative image. Negative images are useful for enhancing white
or grey details embedded in the dark areas of an image.


Fig. 5. Original and Negative of an Image 
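
The negative transformation is simple enough to state directly in code. The following Python sketch assumes an 8-bit grayscale image stored as nested lists; the function name and the sample values are hypothetical.

```python
def negative(image, max_value=255):
    """Return the negative N(r, c) = max_value - I(r, c) of a grayscale image."""
    return [[max_value - p for p in row] for row in image]


# Hypothetical 2 x 2 image with 8-bit intensities.
image = [
    [0, 100],
    [200, 255],
]
inverted = negative(image)
restored = negative(inverted)  # applying the transformation twice restores the image
```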


C. Contrast Stretching 


Contrast stretching is a simple image enhancement technique
that attempts to improve the contrast in an image by
stretching the range of intensity values it contains to span a
desired range of values, for example the full range of pixel
values that the image type concerned allows. Contrast
stretching changes the distribution and range of the digital
numbers assigned to each pixel in an image. This is normally
done to accentuate image details that may be difficult for a
human viewer to observe.
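
A linear (min-max) contrast stretch can be sketched as follows. The snippet assumes a grayscale image stored as nested lists and a target range of 0 to 255; other variants, such as percentile-based stretching, are not shown. The function name and the sample image are illustrative assumptions.

```python
def stretch_contrast(image, out_min=0, out_max=255):
    """Linearly map the image's intensity range onto [out_min, out_max]."""
    pixels = [p for row in image for p in row]
    in_min, in_max = min(pixels), max(pixels)
    if in_max == in_min:
        # Constant image: the range cannot be stretched.
        return [row[:] for row in image]
    span_in = in_max - in_min
    span_out = out_max - out_min
    return [
        [round((p - in_min) * span_out / span_in + out_min) for p in row]
        for row in image
    ]


# Hypothetical low-contrast image using only the range 100 to 150.
image = [
    [100, 120],
    [140, 150],
]
stretched = stretch_contrast(image)
```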


VII. UNDERWATER IMAGE ENHANCEMENT 


Underwater images mainly suffer from poor color contrast and
poor visibility. These problems occur due to the scattering of
light and the refraction of light as it passes from a rarer to
a denser medium. Scattering causes blurring and reduces the
color contrast. These effects on underwater images are due not
only to the nature of the water but also to the organisms and
other material present in it. Many techniques and methods have
been established by researchers to solve the problem of
underwater image enhancement.


Fig. 6. Enhancement of an Image after Contrast Stretching (panels: Original Image, Contrast Enhanced Image)


VIII. UNDERWATER IMAGE ENHANCEMENT TECHNIQUES 
A. Anisotropic Filtering 


Anisotropic filtering is a technique that aims at reducing
image noise without removing significant parts of the image
content. This filter smooths the image in homogeneous regions
but preserves edges and enhances them. It is used to smooth
textures and to reduce artifacts by deleting small edges
amplified by homomorphic filtering.


B. Integrated Color Model 


In underwater situations, the clarity of images is degraded
by light absorption and scattering. This causes one colour
to dominate the image. The integrated color model aims to
improve the clarity of underwater images. Its first step is to
diminish the color cast by equalizing all the color values
present. In the second step, contrast correction is applied to
stretch the histogram values of the red color. The second step
is then repeated for the green and blue colors. In the last
step of the model, the saturation and intensity components of
the HSI color model are used for contrast adjustment, to
enhance the true color and to deal with the issue of uneven
illumination.


C. Red Channel Method 


In the Red Channel method, colors associated with short
wavelengths are recovered, as expected for underwater images,
leading to a recovery of the lost contrast. The Red Channel
method can be regarded as a variant of the Dark Channel
method. The first thing to estimate in this method is the
color of the water (the waterlight). For this, a pixel that
lies at the maximum depth with respect to the camera is
picked. It is assumed that the degradation of the image
depends on the location of the pixel. After estimating the
waterlight, the transmission of the scene is estimated.
Finally, color correction is done.


Fig. 7. Underwater Image Enhancement Using an Integrated Color Model


Fig. 8. Red Channel Underwater Image


IX. CONCLUSION


There are different methods to implement image enhancement,
depending on the type of image. The selection of a method
depends on the input image type. All the above-mentioned
techniques help in making the image noise free, making the
image more visible, eliminating distortions, etc. Different
techniques for each type of image, such as contrast
enhancement, density slicing, histogram equalization, contrast
stretching, anisotropic filtering and the integrated color
model, have been discussed.
Abstract—This paper discusses various security issues for 
mobile cloud computing. Mobile cloud computing (MCC) is a 
trending technology employed in several domains to overcome the 
limitations of mobile devices by using cloud capabilities. 
Communication between mobile devices and the cloud is maintained 
via wireless media to make use of cloud services. Hence, MCC 
models show vital security issues related to many disciplines, 
especially authentication, privacy and trust. Current MCC models 
lack the ability to secure and protect data, resources, 
communication channels, and authentication (client-to-cloud and 
cloud-to-client). MCC security issues have appeared mainly as a 
result of the integration between mobile devices and cloud 
computing. 

Cloud computing is a distributed computing system that offers 
managed, scalable, secure and highly available computation 
resources and software as a service. Considerable risk arises 
when storage and data processing are migrated from the mobile 
device to the cloud. The privacy of users and the integrity of 
data and applications are among the key issues to which most 
cloud providers give attention. This paper identifies the main 
vulnerabilities in these types of systems and the preventive 
measures that could be used to deal with them. To attain more 
security in a mobile cloud environment, threats need to be 
studied and addressed. 

Index Terms—Cloud computing (CC), mobile cloud computing 
(MCC), IaaS, PaaS, SaaS, security, security issues, solutions, 
tools for solutions. 


I. INTRODUCTION 


Mobile Cloud Computing (MCC) is the combination of cloud 
computing and mobile computing to bring rich computational 
resources to mobile users, network operators, as well as cloud 
computing providers. The ultimate goal of MCC is to enable 
execution of rich mobile applications on a plethora of mobile 
devices, with a rich user experience. MCC provides business 
opportunities for mobile network operators as well as cloud 
providers. More comprehensively, MCC can be defined as "a rich 
mobile computing technology that leverages unified elastic 
resources of varied clouds and network technologies toward 
unrestricted functionality, storage, and mobility to serve a 
multitude of mobile devices anywhere, anytime through the 
channel of Ethernet or Internet regardless of heterogeneous 
environments and platforms based on the pay-as-you-use 
principle." 


Fig. 1. Schematic view of mobile cloud computing 


As the need for information storage, retrieval and computing 
increases day by day, organizations are moving from the 
traditional monolithic processing and storage model towards a 
distributed, cloud-based approach. MCC is growing rapidly among 
users because it offers data access at any time, and a wide 
range of mobile cloud applications is now available. MCC 
overcomes the existing limitations of mobile devices with the 
help of cloud-based, mobile-enabled networks. Cloud services 
are offered to mobile users through a variety of applications, 
irrespective of connectivity strength, operating system and 
memory capacity. A huge number of users access software through 
the cloud via data centers placed worldwide. 


II. CLOUD COMPUTING 


Cloud computing is the on-demand availability of computer 
system resources, especially data storage (cloud storage) 
and computing power, without direct active management by 
the user. Large clouds often have functions distributed over 
multiple locations, each location being a data center. Cloud 
computing relies on sharing of resources to achieve coherence 
and economies of scale, typically using a "pay-as-you-go" 
model, which can help in reducing capital expenses but may 
also lead to unexpected operating expenses for unaware users. 


III. CLOUD SERVICES 


Generally, cloud services can be divided into three 
categories: 

1) Software as a Service (SaaS): 
SaaS can be described as the delivery of software 
applications over the Internet. It frees the customer from 
installing and operating the application on their own 
computer and also eliminates the load of software 
maintenance, ongoing operation, safeguarding and support. 

2) Platform as a Service (PaaS): 
PaaS is the delivery of a computing platform and solution 
stack as a service, without software downloads or 
installation for developers, IT managers or end-users. It 
provides an infrastructure with a high level of integration 
in order to implement and test cloud applications. The 
user does not manage the underlying infrastructure, but 
controls the deployed applications and, possibly, their 
configurations. An example of PaaS is Force.com. 

3) Infrastructure as a Service (IaaS): 
IaaS refers to the sharing of hardware resources for 
executing services using virtualization technology. Its 
main objective is to make resources such as servers, 
network and storage more readily accessible by applications 
and operating systems. 


IV. MOBILE CLOUD COMPUTING 


Mobile computing means using portable devices to run 
standalone applications and/or to access remote applications 
via wireless networks. In mobile cloud computing, the mobile 
network and cloud computing are combined, thereby providing 
an optimal service for mobile users. Data are kept on the 
internet rather than on individual devices, providing 
on-demand access. Applications are run on a remote server and 
then sent to the user [2]. 

The architecture of mobile cloud computing is shown in 
Fig. 2. Mobile devices connect to the base stations of the 
mobile wireless network, such as satellites and Base 
Transceiver Stations (BTS). These act as the interface that 
establishes the network connection between the mobile devices 
and the internet. User requests are sent through the wireless 
network to the cloud server, subject to an Authentication, 
Authorization and Accounting (AAA) mechanism. After the user 
requests are delivered to the cloud, the cloud controllers 
process those requests to provide users with the corresponding 
cloud services. 

We present the use case of an online software development 
company, ‘SWShop’. SWShop has an online storefront to 
present its services and connect with customers. SWShop 
has two types of operations: 


• Backend operations, which include development and testing 
of software (SW). 
• Frontend operations, represented by the sale of SW products. 


Fig. 2. Mobile Cloud Computing Architecture 


SWShop should accommodate all types of shoppers, including 
those accessing the online storefront via mobile devices. 

Cloud computing (CC) can provide an effective solution. We 
present two CC solutions. In the first, 1C-Shop, all SWShop 
operations are provided through one cloud service provider 
(CSP). The second, 2C-Shop, explores the use of two CSPs 
collaborating to serve SWShop. 


1) 1C-Shop: 
In 1C-Shop, a single-cloud solution, the backend and 
frontend operations are all provided through a single CSP, 
here Amazon Web Services (AWS). The AWS solution 
credential list gives, for each type of credential, whether 
it is required, how it is created, how it is used, the 
security recommendations, and its validity. 
AWS provides login usernames and passwords and access 
permissions. It also allows the creation of One-Time 
Passwords (OTP), access keys, RSA 2048-bit private/public 
key pairs, etc. Some of these credentials, namely OTP and 
temp access keys (TAK), are optional. 
2) 2C-Shop: 
The 2C-Shop refers to a scenario that utilizes two CSPs: 
AWS for backend operations, while the frontend operation 
(hosting a website) is provided by a different CSP X. 
The interacting entities in this solution are: 
a) AWS components, including the management console 
MC-AWS, and AWS-Sx, which refers to individual AWS 
services. 
b) CSP X components: the CSP X management console MC-X, 
and X-Sy, which refers to individual CSP X services. 
c) SWShop, represented by Admin, developers Uj, and TUi, 
which refers to temp users assuming a defined role 
such as the role of a website visitor. 
V. GENERIC MODEL MC-MODEL 


Based on the scenarios presented in previous sections, we 
introduce a generic model for M2C. MC-Model describes 
entities and interactions among them for multiple CSPs col- 
laborating to perform tasks for a single client. The interacting 
entities are: 


1) Client C, represented by [1]: 

• Admin, which refers to the administrator 
• Uj, which is an employee of C 
• TUi, which is a temp user assuming a pre-defined 
role such as the role of a storefront shopper 


2) For n CSPs, each CSP is represented by: 

• MCSPx, which refers to the management console of 
CSP x, where x is between 1 and n. 
• Sy-CSPx, which refers to each service offered by each 
CSP. So, S2-CSP1 would refer to service S2 offered by 
CSP1. 


3) Financial Third Party, F. 

Varying credentials can be used to secure these interactions. 
The credentials structure for MC-Model is as follows: 


C1. Username and password (PWDroot) are the login 
credentials for the root account. They are required. The 
password should be changed periodically and not 
distributed to other users. It stays valid until changed. 

C2. Individual user account credentials, PWDu, are for 
login by an individual user. They should be set by 
the admin and distributed to other users affiliated 
with the Client C. They stay valid for a set period, 
after which users are required to change them. 

C3. OTPs are provided by an MFA hardware or software 
generator. They can be used for the root account 
(OTProot) and/or the individual user accounts 
(OTPu). They are optional and used once. 

C4. Non-certified public/private key pairs can be created 
for an individual user (PRKU/PUKU) or the root account 
(PRKroot/PUKroot). They can be created by the CSP 
or by a third party. The PUK is saved by both the CSP 
and the user. The PRK is kept by the user and is used 
to encrypt user requests to access services. 

C5. Certified public/private key pair (X509-PUK/PRK). 
The user signs SOAP-protocol requests to service 
interfaces with his X509-PRK. 

C6. Secret access keys are created for the account admin 
(SAKroot) or an individual user (SAKU). The SAK is a 
shared secret between the user and the CSP, used to 
sign programmatic requests to cloud services. When a 
user application issues a request for a cloud service, 
the SAK is used to calculate an HMAC signature 
transmitted as part of the HTTP Authorization header 
or as part of the URL. 

Credentials C4-C6 are recommended to be rotated 
periodically. They stay valid until manually deleted 
by the user or revoked by an admin. 
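
The use of a secret access key (C6) to sign a request can be 
sketched with Python's standard hmac module. The layout of the 
signed message (method, path and timestamp joined by newlines) and 
the function names are assumptions made for illustration; this is 
not the signing scheme of any particular CSP. 

```python
import hashlib
import hmac


def sign_request(secret_access_key: bytes, method: str, path: str,
                 timestamp: str) -> str:
    """Return a hex HMAC-SHA256 signature over a hypothetical canonical form."""
    message = "\n".join([method, path, timestamp]).encode("utf-8")
    return hmac.new(secret_access_key, message, hashlib.sha256).hexdigest()


def verify_request(secret_access_key: bytes, method: str, path: str,
                   timestamp: str, signature: str) -> bool:
    """Recompute the signature on the CSP side and compare in constant time."""
    expected = sign_request(secret_access_key, method, path, timestamp)
    return hmac.compare_digest(expected, signature)
```

The client would send the signature in the HTTP Authorization 
header or in the URL. Because the key itself never travels with 
the request, an eavesdropper who sees one signed request cannot 
sign a different one. 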


TABLE I 
CREDENTIAL ITEMS AND INTERACTING ENTITIES 

Columns: Items | Admin | User Uj | Temp User TUj | CSP 

Rows (credential items): PWDroot, PWDj, OTProot, OTPj, PUKroot, 
PRKroot, PUKuser, PRKuser, X509-PUK, X509-PRK, SAKIDroot, 
SAKroot, SAKIDj, SAKj, TKIDj, TAKj 

PWDroot: Y N N 
[Y/N cell values for the remaining rows are not recoverable] 


C7. Temp access keys (TAK) are created for temporary 
access roles (TUs). They are used in a similar manner 
to SAKs but are issued with a limited lifetime for 
temporary roles. A role is a task defined by the 
admin with attached permissions to access specified 
resources and credentials. An example is the role 
of an unregistered shopper of an online storefront. 
This role may allow read-only permissions to some 
information in the product database of the store and 
includes the credentials necessary to access it. TAKs 
are valid for as long as the role is assumed by a user. 
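
A limited-lifetime key of this kind can be sketched as a token 
that carries a role name and an expiry time, protected by an HMAC 
tag. The token layout (role|expiry|tag), the function names and 
the use of a single admin secret are assumptions for illustration 
only. 

```python
import hashlib
import hmac
import time


def issue_temp_key(admin_secret: bytes, role: str, lifetime_s: int,
                   now=None) -> str:
    """Issue a hypothetical temp access key valid for lifetime_s seconds."""
    issued = int(time.time() if now is None else now)
    payload = f"{role}|{issued + lifetime_s}"
    tag = hmac.new(admin_secret, payload.encode("utf-8"),
                   hashlib.sha256).hexdigest()
    return f"{payload}|{tag}"


def check_temp_key(admin_secret: bytes, token: str, now=None):
    """Return the role if the token is authentic and unexpired, else None."""
    try:
        role, expires, tag = token.rsplit("|", 2)
        expires_at = int(expires)
    except ValueError:
        return None  # malformed token
    expected = hmac.new(admin_secret, f"{role}|{expires}".encode("utf-8"),
                        hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, tag):
        return None  # forged or modified token
    current = time.time() if now is None else now
    return role if current < expires_at else None
```

A storefront could, for example, issue a key for the role 
"shopper" with a lifetime of a few minutes and grant read-only 
access only while check_temp_key returns that role. 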


Table I provides a summary of all credential items and 
interacting entities, showing who (entities) knows what 
(credentials) for MC-Model [1]. 

For M2C Model, the execution workflow goes as follows: 


1) The admin registers for CSPs and creates credentials 
C1-C7 described above. 

2) Admin or individual users Uj use credentials from (1) 
to log in to CSPs. 

3) Admin or individual users Uj request needed CSP 
services through MCSP. SSL protects the secrecy and 
authentication of the communicated messages for steps 
(1-3). 

4) As the project necessitates, Admin or individual users 
Uj access one or more of the services allocated in (3). 
Security of access depends on the type of service 
requested. SSH may be used to provide a secure remote 
login to compute instances. HMAC signatures may be 
employed for other types of services. 

5) The running services may access other services 
provided by the same CSP. 

6) The services running on one CSP may also access 
services provided by another CSP. 

7) Until the task at hand is completed, steps 2 to 6 may 
be repeated. 

Each of these interactions requires two types of messages 
to be communicated between the entities: request (REQ) 
and response (RES). Both types may contain supporting 
data. 


From the workflow described above, we deduce that two 
CSPs collaborating and interacting to provide services and 
perform tasks for one Client C are sufficient to show all types 
of interactions in MC-Model. 


1) User to Management Console of CSP (U-to-MCSP): 
Registration, security set-up, and users requesting 
services from CSPs are examples of this type of 
interaction [1]. For this type of interaction, 
authentication may be provided through passwords, MFA 
and OTPs. 

2) User to Service (U-to-S): 
An example of this interaction is when a user requests 
a service S from a CSP. For this type of interaction, 
authentication may be provided through a public/private 
key pair, X509 certificates, or secret keys, depending on 
the type of service S. 

3) Cloud Service to another Cloud Service: 
There are two distinct types of this interaction: 

- Service-to-Service provided by the same CSP (S-to-S): 
This type of interaction occurs when one cloud 
service S1 requests another service S2 provided by 
the same CSP. For this type, a user authenticated by 
S1 is assumed to be trusted by S2. Moreover, the 
requesting service S1 trusts S2 because they reside 
in the same CSP domain. 

- Service Sa on CSP X to another Service Sb on CSP Y 
(Sa-CSPx-to-Sb-CSPy): 
This type of interaction occurs when one cloud 
service S1 provided by CSP X requests another 
service S2 provided by another CSP Y. Authentication 
should be provided for both services mutually to 
safeguard the user's data and processes as well as 
other clients of both CSPs. 
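
The interaction types above and the authentication options named 
for each can be summarised in a small lookup structure. The enum, 
the dictionary and the classify function are hypothetical names 
introduced for illustration; the mapping itself only restates the 
text. 

```python
from enum import Enum


class Interaction(Enum):
    U_TO_MCSP = "user to management console"
    U_TO_S = "user to service"
    S_TO_S = "service to service, same CSP"
    S_TO_S_CROSS = "service to service, different CSPs"


# Authentication options named in the text for each interaction type.
AUTH_OPTIONS = {
    Interaction.U_TO_MCSP: ["password", "MFA", "OTP"],
    Interaction.U_TO_S: ["public/private key pair", "X509 certificate",
                         "secret key"],
    Interaction.S_TO_S: [],  # trust is inherited inside one CSP domain
    Interaction.S_TO_S_CROSS: ["mutual authentication"],
}


def classify(source_kind, source_csp, target_kind, target_csp):
    """Classify a request; kinds are 'user', 'service' or 'console'."""
    if source_kind == "user":
        if target_kind == "console":
            return Interaction.U_TO_MCSP
        return Interaction.U_TO_S
    if source_csp == target_csp:
        return Interaction.S_TO_S
    return Interaction.S_TO_S_CROSS
```

For example, a request from a service on CSP1 to a service on CSP2 
is classified as a cross-CSP interaction, for which the model asks 
for mutual authentication. 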


VI. SECURITY ISSUES IN CLOUD COMPUTING 


1) Data Integrity 
When data is in the cloud, anyone from any location may 
be able to access it. The cloud does not differentiate 
between sensitive data and common data, which can enable 
anyone to access sensitive data [3]. 

2) Data Theft 
Most cloud vendors lease servers from other service 
providers instead of acquiring their own, because leasing 
is cost-effective and flexible for operation. Since the 
customer is unaware of this arrangement, there is a high 
possibility that the data can be stolen from the external 
server by a malicious user. 

3) Privacy Issues 
The vendor must make sure that the customer's personal 
information is well secured from other operators. As most 
of the servers are external, the vendor should know who is 
accessing the data and who is maintaining the server, so 
that it can protect the customer's personal information [7]. 

4) Infected Application 
The vendor should have complete access to the server for 
monitoring and maintenance, so that it can prevent a 
malicious user from uploading an infected application 
onto the cloud, which would severely affect the customer. 

5) Data Location 
The location of the data is not transparent; even the 
customer does not know where their own data is located. 
The vendor does not reveal where all the data is stored. 
The data may not even be in the customer's country; it 
might be located anywhere in the world. 

6) Security on Vendor Level 
The vendor should make sure that the server is well secured 
against all the external threats it may come across. A cloud 
is good only when the vendor provides good security to the 
customers. 

7) Security on User Level 
Even though the vendor provides a good security layer for 
the customer, the customer should make sure that its own 
actions do not cause loss of, or tampering with, the data 
of other users of the same cloud. 

8) Lack of Standards 
The immaturity of this technology makes it difficult to 
develop a comprehensive and commonly accepted set of 
standards. 

9) Interoperability Issues 
Cloud computing offers a degree of resource scalability 
that has never been reached before. Cloud providers may 
find customer lock-in attractive, but for customers, 
interoperability issues mean that they are vulnerable to 
price increases, quality of service not meeting their 
needs, closure of one or more cloud services, the provider 
going out of business, and disputes with the cloud provider. 

10) Latency 
Latency has always been an issue in cloud computing, with 
data expected to flow between different clouds. Other 
factors that add to latency are encryption and decryption 
of the data as it moves across unreliable and public 
networks, congestion, packet loss and windowing [3]. 


Fig. 3. Workflow of MC Model 


VII. SECURITY MEASURES IN MOBILE CLOUD COMPUTING 


Since the security issues fall in two categories, the security 
measures are also described in two categories [2]: 


A. Mobile Network User's Security 


1) Don't leave your mobile device unattended. 

2) Protect Your Device with Passwords: Enable your de- 
vice’s power-on login, system login authentication, and 
password-protected screen saver. 

3) Disable Wireless Connection When It Is Not In Use: 
WiFi, infrared, and Bluetooth devices are constantly 
announcing their presence if they are enabled. 

4) Protect your device with anti-virus software using the 
latest virus definitions. 

5) Remove Your Preferred Network List When Using Public 
Wireless Service. 

6) Encrypt Your Wireless Traffic Using a Virtual Private 
Network (VPN). 

7) Turn off Ad-Hoc Mode Networking. 

8) Turn off Resource Sharing Protocols for Your Wireless 
Interface Card. 


B. Measures for Cloud Security 


Data can be encrypted to reduce the impact of a breach, 
but if the encryption key is lost, the data is also lost. 
Conversely, if offline backups of the data are kept to reduce 
the risk of data loss, the exposure to data breaches increases. 

A malicious hacker might delete a target's data out of spite, 
but data could equally be lost through a careless cloud service 
provider or a disaster such as a fire, flood, or earthquake. 
Compounding the challenge, encrypting the data to ward off 
theft can backfire if the encryption key is lost. 

A key defence against account compromise is to protect 
credentials from being stolen. Organizations should look to 
prohibit the sharing of account credentials between users and 
services, and they should leverage strong two-factor 
authentication techniques where possible. 
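
As one concrete example of a second factor, the sketch below 
generates counter-based one-time passwords following the HOTP 
algorithm of RFC 4226. The verify_hotp helper and its look-ahead 
window are assumptions added for illustration and are not part of 
the measures listed above. 

```python
import hashlib
import hmac
import struct


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """HOTP value (RFC 4226) for a shared secret and a moving counter."""
    digest = hmac.new(secret, struct.pack(">Q", counter),
                      hashlib.sha1).digest()
    offset = digest[-1] & 0x0F  # dynamic truncation offset
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def verify_hotp(secret: bytes, counter: int, submitted: str, window: int = 2):
    """Return the next counter value if the code is accepted, else None.

    The small look-ahead window (an assumed policy) tolerates codes that
    were generated on the device but never submitted.
    """
    for candidate in range(counter, counter + window + 1):
        if hmac.compare_digest(hotp(secret, candidate), submitted):
            return candidate + 1  # each code is accepted only once
    return None
```

The server stores the returned counter after every successful 
login, so a code that has already been used is rejected if it is 
replayed. 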

From IaaS to PaaS to SaaS, the malicious insider has 
increasing levels of access to more critical systems and 
eventually to data. In situations where a cloud service 
provider is solely responsible for security, the risk is great. 
Even if encryption is implemented, the system is still 
vulnerable to malicious insider attack unless the keys are 
kept with the customer and are only available at data-usage 
time. 


VIII. ISSUES OF MOBILE COMPUTING 


Some of the security issues relating to the mobile cloud are 
as follows [5], [6]: 


1) Bandwidth 
Bandwidth is one of the major issues in a mobile cloud 
computing environment, since the radio resources of 
wireless networks are scarce compared to traditional 
wired networks. 

2) Availability 
Resources must be available to users on the cloud. Mobile 
users need a discovery mechanism so that they can connect 
to the cloud and share or access resources on demand. 

3) Heterogeneity 
This challenge arises when mobile users access the 
cloud through different radio access technologies such as 
GPRS, WiMAX, etc. 

4) Security 
Since information is accessed from the cloud using 
wireless technology, it is important to secure the user's 
information and communication. Cryptographic techniques 
are needed to secure the information on the cloud. 

5) Authentication 
To secure the data on the cloud, various authentication 
methods have to be provided, so that a user must 
authenticate before accessing the cloud. 
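
Password-based authentication, one of the methods meant here, 
should not store passwords in clear text on the cloud side. A 
minimal sketch using the PBKDF2 function from Python's standard 
library is given below; the iteration count, the salt length and 
the function names are assumptions chosen for illustration. 

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple


def hash_password(password: str, salt: Optional[bytes] = None,
                  iterations: int = 100_000) -> Tuple[bytes, bytes]:
    """Derive a salted PBKDF2-HMAC-SHA256 hash; returns (salt, digest)."""
    if salt is None:
        salt = os.urandom(16)  # fresh random salt per user
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"),
                                 salt, iterations)
    return salt, digest


def check_password(password: str, salt: bytes, expected: bytes,
                   iterations: int = 100_000) -> bool:
    """Recompute the hash for a login attempt and compare in constant time."""
    _, digest = hash_password(password, salt, iterations)
    return hmac.compare_digest(digest, expected)
```

Only the salt and the digest are stored; at login the submitted 
password is hashed again with the stored salt and the two digests 
are compared. 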


IX. WIRELESS SECURITY ISSUES 


Security issues arise mainly when attackers intercept radio 
signals. They also arise from improper management of the 
network by the user. Most wireless networks are managed by a 
private operator on which users depend, so users have very 
little control over security. The most commonly observed 
security issues of mobile computing over wireless networks are: 


A. Denial of Service (DoS) attacks: 
This is a very common attack found on almost all kinds of 
networks. The attacker targets a communicating server by 
sending a large amount of data over the wireless network 
services, which slows down the network and prevents 
legitimate users from benefiting from its services. [10] 
B. Traffic analysis: 
Traffic analysis examines in depth which types of packets 
are flowing through the network. Network traffic is 
monitored using bandwidth monitoring software. Attackers 
mostly use such software to analyse patterns in network 
traffic, to identify patterns that are susceptible to 
interruption, or to recover complex data. [11] 
C. Eavesdropping: 
An attack in which the attacker tries to steal data 
transmitted by a machine over the network is known as a 
sniffing or snooping attack. An insecure communication 
network makes it easy to access the data being sent or 
received, which is what an eavesdropping attack exploits. 
These attacks are not easy to identify because they do not 
interrupt network transmissions, which appear to be 
operating normally. [12] 
D. Session interception and message modification: 
The attacker intercepts a session and modifies the 
transmitted data. When the attacker inserts itself between 
a sender and a receiver who are exchanging data, this is 
called a man-in-the-middle attack. [10] 
E. Spoofing: 
In these attacks the attacker pretends to be a reliable 
source, such as a trusted email sender or website. Spoofing 
also applies to phone calls and, occasionally, to technical 
elements such as ARP (Address Resolution Protocol) packets 
and IP addresses. [13] 


X. PIE-CHART FOR SECURITY ISSUES 


This pie chart shows that legal and governance issues 
represent a clear majority, with 73% of concern citations, 
showing a deep consideration of legal issues such as data 
location and e-discovery, or governance ones like loss of 
control over security and data [15]. The technical issue most 
intensively evaluated (12%) is virtualization, followed by data 
security, interfaces and network security. Virtualization is 
one of the main novelties employed by cloud computing in terms 
of technologies, considering virtual infrastructures, 
scalability and resource sharing, and its related problems 
represent the first major technical concern [15]. 


Fig. 4. Security problems with grouped categories 


XI. EXISTING SOLUTIONS TO MANAGING CLOUD 
A. Mirage Image Management System 


The integrity of VM images is the backbone of the safety 
and security of the entire cloud. An image management system 
called Mirage addresses the security concerns outlined above 
in an effective way [9]; its use of filters reduces the 
threat. Among its security management features, it offers an 
access control framework that regulates the sharing of VM 
images. This reduces the publisher's risk of unauthorized 
access to the images [8]. 


B. Client-Based Privacy Manager 


Privacy Manager software on the client helps the user to 
protect their privacy when accessing cloud services. The 
main feature of the Privacy Manager is that it can provide an 
obfuscation and de-obfuscation service, to reduce the amount 
of sensitive data stored within the cloud. In addition, the 
Privacy Manager allows the user to express privacy preferences 
about the treatment of their personal information, including 
the degree and type of obfuscation used. Personae, in the 
form of icons that correspond to sets of privacy preferences, 
can be used to simplify this process and make it more intuitive 
to the user. The user personae will be defined by the cloud 
service interaction context. 
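
The idea of client-side obfuscation and de-obfuscation can be 
sketched as follows: a field is masked with a key that stays on 
the client, so the cloud only stores the masked value. The 
keystream construction (HMAC-SHA256 over a field label and a 
counter) and the function names are assumptions for illustration. 
This is not the algorithm of the Privacy Manager itself, and it is 
not a vetted encryption scheme, since reusing the same key and 
label for different values leaks information. 

```python
import hashlib
import hmac


def _keystream(key: bytes, label: str, length: int) -> bytes:
    """Derive `length` pseudo-random bytes from the client key and a label."""
    out = b""
    counter = 0
    while len(out) < length:
        block = f"{label}:{counter}".encode("utf-8")
        out += hmac.new(key, block, hashlib.sha256).digest()
        counter += 1
    return out[:length]


def obfuscate(key: bytes, field: str, value: str) -> str:
    """Mask a field value on the client before it is sent to the cloud."""
    raw = value.encode("utf-8")
    stream = _keystream(key, field, len(raw))
    return bytes(a ^ b for a, b in zip(raw, stream)).hex()


def deobfuscate(key: bytes, field: str, blob: str) -> str:
    """Recover the original value on the client from the stored blob."""
    raw = bytes.fromhex(blob)
    stream = _keystream(key, field, len(raw))
    return bytes(a ^ b for a, b in zip(raw, stream)).decode("utf-8")
```

A privacy preference could then decide which fields are passed 
through obfuscate before upload and which are sent unchanged. 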
XII. TOOLS AND SOLUTIONS 


1) Acunetix 
This information gathering tool scans web applications 
on the cloud and lists possible vulnerabilities that might 
be present in the given web application. Most of the 
scanning is focused on finding SQL injection and 
cross-site scripting vulnerabilities. It has both free 
and paid versions, with paid versions including added 
functionalities. 

2) Aircrack-NG: A tool for Wi-Fi pen testers 
This is a comprehensive suite of tools designed 
specifically for network pen testing and security. This 
tool is useful for scanning Infrastructure as a Service 
(IaaS) models. Having no firewall, or a weak firewall, 
makes it very easy for malicious users to exploit your 
network on the cloud through virtual machines. 

3) Cain & Abel 
This is a password recovery tool. Cain is used by 
penetration testers for recovering passwords by sniffing 
networks, brute forcing and decrypting passwords. It 
also allows pen testers to intercept VoIP conversations 
that might be occurring through the cloud. 

4) Ettercap 
Ettercap is a free and open source tool for network 
security, designed for analyzing computer network 
protocols and detecting MITM attacks. It is usually 
accompanied by Cain. This tool can be used for pen 
testing cloud networks and verifying leakage of 
information to an unauthorized third party. 

5) John the Ripper 
The name of this tool was inspired by the infamous 
serial killer Jack the Ripper. The tool was written by 
Black Hat Pwnie winner Alexander Peslyak. Usually 
abbreviated to just "John", this is freeware which has 
very powerful password cracking capabilities; it is highly 
popular among information security researchers as a 
password testing and breaking tool [14]. 


XIII. CONCLUSION 


Mobile cloud computing is a technology that combines 
the advantages of mobile networks and cloud computing. 
Cloud computing offers on-demand network access to a shared 
pool of computing resources (e.g., networks, servers, storage, 
applications, and services) that can be rapidly provisioned and 
released with minimal management effort or service provider 
interaction. This paper identifies what mobile cloud 
computing is and what challenges and issues relate to it. 
Many new technologies are emerging at a rapid rate, each with 
technological advancements and with the potential to make 
people's lives easier. However, one must be very careful to 
understand the security risks and challenges posed in utilizing 
these technologies. Cloud computing is no exception. There is 
still a need to find more innovative approaches that can put an 
end to these threats and issues, which continue to arise in a 
never-ending process. 
Abstract—The electronic payment system has grown increasingly 
over the last decades due to the growing spread of 
internet-based banking and shopping. As technology develops, 
we see the rise of electronic payment systems and payment 
processing devices. As these increase, improve, and provide 
ever more secure online payment transactions, the percentage 
of check and cash transactions will decrease. E-payment is very 
convenient compared to traditional payment methods such as cash 
or check. Since customers can pay for goods or services online 
at any time of day or night, from any part of the world, they 
don't have to spend time in a line, waiting for their turn to 
transact. Nor do they have to wait for a check to clear the 
bank so they can access the funds they need to shop. E-payment 
also eliminates the security risks that come with handling 
cash. 

The era of Information and Communication Technology (ICT) 
and digital innovation has led to dynamic changes in the 
business environment, where business transactions continue to 
shift from cash-based transactions to electronic-based 
transactions. The availability of e-payment technologies in the 
developed world provides opportunities for their transfer to 
and adaptation in the developing world. However, research on 
attempts by governments or e-business entrepreneurs to provide 
e-payment innovations in the developing world, and on possible 
institutional effects on such initiatives, remains limited. 

Digital payment systems benefit a small fruit vendor, a 
medium-sized factory manager, and a health professional with 
his or her own office alike. When entrepreneurs can easily 
monitor their daily sales through digital payments and 
collections, they are also in a position to manage inventories 
more effectively and increase profit margins. Also, by 
participating in e-commerce through digital payments, 
entrepreneurs can broaden their customer base and visibility, 
thereby gaining the ability to further develop their business. 

Index Terms—Credit-debit payment, digital cash, smart card, 
e-cheque, digital wallet, payment protocol, payment security, 
cyber cash. 


I. INTRODUCTION 


The purpose of this study is to understand the various 
e-payment systems, their security, and how they can affect 
e-business and e-commerce. 

E-commerce provides the capability of buying and selling 
products, information and services on the internet and other 
online environments. In an e-commerce environment, payments 
take the form of money exchange in an electronic form and are 
therefore called electronic payments. An e-payment system is 
a way of making transactions or paying for goods and services 
through an electronic medium, without the use of checks or 
cash. As these systems increase, improve, and provide ever 
more secure online payment transactions, the percentage of 
check and cash transactions will decrease. 


Fig. 1. Electronic payment system 


The various e-payment systems include credit cards, debit
cards, smart cards, and electronic cash. Entrepreneurship, in
this context, is the process of founding new Internet-based
businesses, such as an e-payment service provider. In an
e-business context, e-payment or online payment refers to the
exchange of monetary value between payers and payees via the
Internet or mobile networks. E-payment systems can conveniently
and affordably connect entrepreneurs with banks, employees,
suppliers, and new markets for their goods and services. These
systems can accelerate business registration and payments for
business licenses and permits by reducing travel time and
expenses. Digital payments improve the speed and reduce the cost
of payments between entrepreneurs and suppliers, employees,
customers, and governments. The topics covered are:


• Electronic payment methods
• Protocols and security of E-payment
• Business and E-payment
• Survey and research methodology


TABLE I
COMPARISON OF ELECTRONIC PAYMENT SYSTEMS

Features | Online credit card payment | Electronic cash | Electronic cheque | Smart cards
Actual payment time | Paid later | Prepaid | Paid later | Prepaid
Transaction information transfer | The store and bank check the status of credit card | Fee transfer. No need to leave the name of parties involved | Electronic checks or payment indication must be endorsed | The smart card of both parties makes the transfer
Online and offline transaction | Online | Online | Offline allowed | Offline allowed
Bank A/C involvement | Credit card account | No involvement | Bank account | Smart card account
Users | Any legitimate credit card users | Anyone | Anyone with the bank account | Anyone with bank or credit card a/c
Party to which payment is made | Distributing banks | Store | Store | Store
Consumer's transaction risk | Mostly borne by distributing banks | The consumer at risk of stolen or misused | The consumer bears risk but can stop check | Consumers-risk of stolen, lost or misused
The current degree of popularity | Credit card org. checks for certification and total purchases. Thus, used internationally | Unable to meet internet standards in the areas of potential expansion & Intel. | It cannot meet international standards so not so popular. | Like online credit cards and is becoming more widely used.
Anonymity | Partially or entirely | Entirely | No anonymity | Entirely, but if needed by the central processing agency can ask.
Small payments | Transaction costs high. So, not suitable. | Low transaction cost. Suitable | It allows stores to accumulate debts until it reaches the limit before paying for it. | Transaction costs are low. Like electronic cheque.
Database safeguarding | Safeguards regular credit card information. | Large database & records S. No's of use etc. Cash. | Safeguards regular account information. | Safeguards regular account information.
Transaction information face value | It can be signed & issued freely in compliance with the limit. | Face value is often set & can't be altered. | It can be signed & issued freely in compliance with the limit. | It can be deducted freely in compliance with the limit.
Real/Virtual world | It can be partially used in the real world. | An only virtual world. | Limited to virtual but share checking a/c in the real world. | It can be used in real or virtual.
Limit on transaction | It depend upon credit card limit. | It depends upon how much prepaid. | No limit. | It depend on how much money is saved.
Mobility | Yes | No | No | Yes

II. ELECTRONIC PAYMENT METHODS 


Payment started with the barter system centuries ago. The
major drawback of the barter system was that buyer and seller
each had to want the goods that the other had in surplus. This
led to the next generation of payment methods, the commodity
money system. With cash, the process is simple and there is no
bank involvement. There is, however, an overhead of printing
notes, the method is very insecure, and no record of the
transaction is maintained.


A. Limitations of Traditional Payment Systems 


Several limitations of traditional payment systems in the
context of e-commerce can be outlined:

• Lack of usability
• Lack of security
• Lack of eligibility
• High usage costs for customers and merchants
• Lack of efficiency
• Lack of consistency


B. Electronic Payment System 


In this system, money is transferred from one person to
another electronically through various electronic payment
systems, which help customers to make online payments, in other
words to transfer money over the Internet.

There are various types of e-payment methods, such as credit
card, debit card, smart card, digital wallet (or electronic
wallet), and electronic cheque.

Fig. 2. Classification of Electronic Payment Systems (figure
labels: Debit Card, online & offline; Electronic Check, online;
B2B)

1) Credit card:
A credit card is a thin rectangular piece of plastic or metal
issued by financial institutions, which lets you borrow
funds from a pre-approved limit to pay for your purchases.
The limit is decided by the institution issuing the card
based on your credit score and history.

2) Debit card:
A debit card is a prepaid card, also known as an ATM card.
An individual has to open an account with the issuing
bank, which provides a debit card with a personal
identification number (PIN). When making a purchase, the
cardholder enters the PIN on the shop's PIN pad.

3) Smart card:
Smart cards were first introduced in Europe; most of these
are known as stored-value cards. A smart card is about the
size of a credit card, made of plastic with an embedded
microprocessor chip that holds important financial and
personal information. The microprocessor chip is loaded with
the relevant information and periodically recharged. In
addition to this information, systems have been developed to
store cash on the chip. The money on the card is stored in
encrypted form and is protected by a password to ensure the
security of the smart card solution.

4) Digital wallet:
Electronic wallets, which are very useful for frequent online
shoppers, are commercially available for pocket, palm-sized,
handheld, and desktop PCs. They offer a secure, convenient,
and portable tool for online shopping.

5) Electronic cheque:
An electronic cheque is a message that contains all the
information found on an ordinary cheque, but it uses a
digital signature for signing and endorsing and a digital
certificate to authenticate the bank account. Many websites
accept electronic cheques. The electronic payment process
resembles that of paper cheques but offers greater security
and more features. Electronic cheques are typically used for
orders processed online and are governed by the same laws
that apply to paper cheques.


C. Comparison of E-payment Systems 


E-cash, debit cards, and smart cards are prepaid modes of
payment, whereas e-cheques and credit cards are paid later.
Smart cards, debit cards, and e-cheques allow offline
transactions, while credit cards and e-cash involve online
transactions.

Credit card payments are made through the credit card
account. E-cash involves no bank account, while the other
systems make payments through an account.

For a more detailed comparison, see Table I.
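
The classification above can be captured in a small lookup structure. The following Python sketch is illustrative only: the class name, the field names `timing` and `offline_allowed`, and the helper `names_where` are assumptions introduced here, and the values simply restate the comparison given in this subsection.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PaymentSystem:
    name: str
    timing: str            # "prepaid" or "paid later"
    offline_allowed: bool  # True if offline transactions are allowed


# Values restate the prose comparison; they are not an independent data source.
SYSTEMS = [
    PaymentSystem("credit card", "paid later", False),
    PaymentSystem("e-cash", "prepaid", False),
    PaymentSystem("debit card", "prepaid", True),
    PaymentSystem("smart card", "prepaid", True),
    PaymentSystem("e-cheque", "paid later", True),
]


def names_where(predicate):
    """Return the sorted names of all systems that satisfy the predicate."""
    return sorted(s.name for s in SYSTEMS if predicate(s))


prepaid = names_where(lambda s: s.timing == "prepaid")
offline = names_where(lambda s: s.offline_allowed)
```

Keeping the attributes in one table makes it easy to ask other questions of the same data, for example which systems are both prepaid and usable offline.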


III. PROTOCOLS AND SECURITY OF E-PAYMENT SYSTEMS 


The credit card is the most popular e-payment system for
business-to-customer transactions. A credit card payment is an
instruction by a customer for funds to be transferred into the
business's account. The credit card number can be sent over the
Internet encrypted or unencrypted. All Internet services provide
some level of security.

1) Security:

In order to retain the faith of existing online shoppers
and attract new customers, it is vital that credit card
processing software incorporate the highest levels of security.
The Internet Fraud Watch program was established in
1992 by the National Consumers League to monitor
consumer fraud. The Federal Trade Commission is
attempting to track fraud against business and estimates that
fraud, security violations, and theft of intellectual property
amounted to $250 million last year. But industry numbers
are difficult to track because Internet transactions
are not yet differentiated from mail order transactions in
credit card processing systems. In an attempt to remedy
this, Visa will soon implement electronic commerce flags
that track Internet transactions. Internet fraud affects
both the consumer and the merchant. Internet credit
card transactions are classified as “card not present”
transactions. Hence merchants are fully liable for the
losses even though the transaction was successfully
authorized beforehand. If the goods do not reach the right
consumer, the merchant will be charged back the
amount of the transaction.

2) Card Verification Value 2 (CVV2) / Card Identification
(CID):

An important security feature for card-not-present
transactions, Card Verification Value 2 (CVV2), now appears
on the back of most Visa cards in the signature section after
the credit card account number. American Express has
introduced a similar code called CID. The CVV2 is a
three-digit number that helps validate that the customer is in
possession of a genuine and legitimate card.
3) CyberCash: 

CyberCash, Inc. of Reston, VA was founded in August
1994 to provide software services and solutions for secure
financial transactions over the Internet. The CyberCash
Secure Internet Payment System, which uses special
wallet software, enables consumers to make secure purchases
using major credit cards from CyberCash-affiliated
merchants. The CyberCash payment system was launched
in 1995, and by mid-1996 it had over half a million users.
The system is mainly used to sell tangible goods.


The CyberCash Wallet software is the front-end application
of CyberCash, installed on the buyer’s machine alongside a web
browser. The corresponding merchant software, called CyberCash
Cash Register, is installed on the merchant’s machine that runs
the web server.

When the consumer clicks the “Pay” button in the web
browser, this message is sent to the merchant’s server.
The merchant software sends a summary of the item, price, and
transaction ID to the consumer’s web browser. This message
also launches the wallet software on the client. The wallet
maintains a list of credit cards owned by the buyer and prompts
the buyer to choose from the list. The buyer chooses the credit
card to pay with and clicks the wallet’s pay button. This
initiates the CyberCash payment protocol. The card details are
securely sent to the merchant. The merchant authorizes the
payment with the financial network via the CyberCash payment
server. If the authorization is successful, the goods are
delivered to the buyer. The cash register software also performs
a capture in batch mode every day. This results in the transfer
of funds from the cardholder’s account to the merchant through
the CyberCash payment server gateway.
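
The sequence of steps described above (order summary, card selection, authorization, daily batch capture) can be sketched as a toy simulation. This is a minimal sketch under stated assumptions and not CyberCash's actual software or protocol: the class `CashRegister`, the function `toy_payment_server`, the approval rule based on a card limit, and all sample data are hypothetical, and no encryption is modelled.

```python
from dataclasses import dataclass, field


@dataclass
class CashRegister:
    """Hypothetical stand-in for the merchant-side software."""
    authorized: list = field(default_factory=list)  # approved, not yet captured

    def summary(self, item, price_cents, txn_id):
        # The merchant sends item, price and transaction id to the buyer.
        return {"item": item, "price_cents": price_cents, "txn_id": txn_id}

    def authorize(self, order, card, payment_server):
        # The merchant asks the payment server to authorize the chosen card.
        approved = payment_server(card, order["price_cents"])
        if approved:
            self.authorized.append(order)
        return approved

    def capture_batch(self):
        # Daily batch capture: returns the total to transfer, then clears the queue.
        batch, self.authorized = self.authorized, []
        return sum(order["price_cents"] for order in batch)


def toy_payment_server(card, amount_cents):
    # Assumption of this sketch: approve any amount within the card's limit.
    return amount_cents <= card["limit_cents"]


register = CashRegister()
wallet = [{"label": "card-A", "limit_cents": 5_000}]  # buyer's list of cards
order = register.summary("book", 2_500, "txn-001")
approved = register.authorize(order, wallet[0], toy_payment_server)
captured_total = register.capture_batch()
```

Separating authorization from capture mirrors the description: goods can be released once a payment is authorized, while the actual transfer of funds happens later in a batch.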


IV. PROTOCOL 


A. E-Payment Protocol Requirements 


The e-payment protocol involves three participants.


1) User:
The user (customer) purchases e-currency from the bank
with actual money by e-payment. The user can then
use the e-currency to make e-payments for goods.

2) Merchant:
The merchant is the data storage that provides the user with
both services and information.

3) Bank:
The bank is the trusted authority. It mediates between the
user and the merchant in order to ease the duties they carry
out. In general, the bank acts like a broker that offers the
e-coins for e-payments.


TABLE II
PROS AND CONS OF E-PAYMENT

Pros | Cons
Quick and convenient | Fraud concerns
Ease of use | Technical problems
One click payment | Increased business costs
Increased sales |


B. Shared Set of Characteristics


When e-currency is used, e-payment protocols share the
following set of characteristics:


1) Anonymity:
E-cash must not reveal any information about the user; that
is, e-currency transactions must be anonymous.
2) Divisibility:
E-cash can be subdivided, since the notes are built from a
basic unit.
3) Transference:
E-cash can be transferred to a trusted authority by
providing the suitable amount of currency.
4) Overspending detection:
Each unit of e-cash must be usable only once.
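
The overspending rule above can be illustrated with a toy bank that remembers which coins have already been deposited. This is a minimal sketch under stated assumptions: the class `Bank`, its methods, and the use of random serial numbers are introduced here for illustration. It does not provide the anonymity property, because the bank sees each serial number at withdrawal; a real scheme would need an additional mechanism such as blind signatures.

```python
import secrets


class Bank:
    """Toy issuer that tracks issued coins and rejects double spending."""

    def __init__(self):
        self._issued = {}    # serial number -> face value
        self._spent = set()  # serial numbers that were already deposited

    def withdraw(self, value):
        # Random serial number; a real scheme would hide it from the bank.
        serial = secrets.token_hex(16)
        self._issued[serial] = value
        return serial

    def deposit(self, serial):
        if serial not in self._issued:
            raise ValueError("unknown coin")
        if serial in self._spent:
            raise ValueError("coin already spent")
        self._spent.add(serial)
        return self._issued[serial]
```

A merchant would call `deposit` when presenting a received coin to the bank; the second attempt with the same serial number raises an error, which is the detection step.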


V. SECURE ELECTRONIC TRANSACTION (SET) PROTOCOL 


Secure Electronic Transaction (SET) is a system that
ensures the security and integrity of electronic transactions
made using credit cards. SET is not itself a payment system;
it is a security protocol applied to those payments. It uses
different encryption and hashing techniques to secure credit
card payments made over the Internet. The development of the SET
protocol was supported by major organizations such as Visa,
Mastercard, Microsoft, which provided its Secure Transaction
Technology (STT), and Netscape, which provided the technology
of Secure Sockets Layer (SSL).
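
As one illustration of how hashing can bind the parts of a card payment together, the sketch below follows the dual-digest idea commonly described for SET: the order details and the payment details are hashed separately, and the two digests are hashed again together, so that the merchant can check the link without seeing the card data. This is a simplified sketch and not the SET specification: the signing step is omitted, and the choice of SHA-256, the function names, and the sample data are assumptions made here.

```python
import hashlib


def digest(data: bytes) -> bytes:
    # SHA-256 is chosen for convenience; it is an assumption of this sketch.
    return hashlib.sha256(data).digest()


def linked_digest(payment_digest: bytes, order_digest: bytes) -> bytes:
    # Hash of the two digests concatenated; the cardholder would sign this value.
    return digest(payment_digest + order_digest)


# Hypothetical example data.
order_info = b"order txn-001: 1 book"
payment_info = b"card ending 0000, amount 25.00"

signed_value = linked_digest(digest(payment_info), digest(order_info))


def merchant_recompute(order_info_seen: bytes, payment_digest_seen: bytes) -> bytes:
    # The merchant sees the order in clear but only the digest of the payment data.
    return linked_digest(payment_digest_seen, digest(order_info_seen))
```

If the order is altered after the cardholder computed `signed_value`, the value recomputed by the merchant no longer matches, which is how tampering would be noticed.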


VI. BUSINESS AND E-PAYMENT 


E-business, or online business, is business that an
organization conducts over a network. The transition from
offline business to online business led to the idea of online
payment. Hence e-payment systems were introduced in e-business.
They were introduced in the 1960s and have been expanding
rapidly, so e-payment has become an important factor in
e-business, especially in this generation. In the past five
years, the emergence of e-payment systems has revolutionized the
way we buy and sell goods and services.

The transfer of funds electronically is a major component
of any e-business venture, whether B2B (business to business),
B2C (business to customer), or B2G (business to government).


A. Advantages of Digital Payment Systems


1) Digital payment systems can conveniently and affordably 
connect entrepreneurs with banks, employees, suppliers, 
and new markets for their goods and services. 

2) These systems can accelerate business registration and
payments for business licenses and permits by reducing
travel time and expenses. Digital financial services can
also improve access to savings accounts and loans.

3) Digital payments give women entrepreneurs greater con- 
trol over their income, potentially benefiting their entire 
household, especially children. 

4) Digital payments improve the speed and reduce the 
cost of payments between entrepreneurs and suppliers, 
employees, customers, and governments. Digital financial 
systems make it easier for entrepreneurs to access credit 
products to start and expand their businesses, and encour- 
age formal entrepreneurship by facilitating compliance 
with regulatory and tax obligations. 


B. Entrepreneur Benefits


1) Digital payment systems allow entrepreneurs to pay for 
goods or services electronically, using a mobile phone, 
the internet, retail point of sales, and other broadly avail- 
able access points instead of using cash or checks. For 
entrepreneurs, especially in developing markets, access to 
digital payment platforms is more than just a convenience. 

2) For those starting a business, digital payments can speed 
up business registration and reduce the transfer time on 
payments for business licenses and permits. 

3) Moving from cash to digital payments can increase an 
entrepreneur’s profitability by reducing operating costs 
and making it easier to manage trade contracts, delivery 
records, and accounts receivables. 

4) Making and receiving digital payments can increase an 
entrepreneur’s participation in e-commerce and improve 
their interactions with clients, vendors, and financial 
institutions. 

5) Digital payments can increase an entrepreneur’s prof- 
itability by making financial transactions with customers, 
suppliers, and the government more convenient, safer, and 
cheaper. 

6) Paying wages digitally benefits employees and is safer 
and more cost-effective for employers. 

7) Digital payments improve the speed and reduce the 
cost of payments between entrepreneurs and suppliers, 
employees, customers, and governments. 

8) Digital financial systems make it easier for entrepreneurs 
to access credit products to start and expand their busi- 
nesses, and encourage formal entrepreneurship by facili- 
tating compliance with regulatory and tax obligations. 

9) Governments in developing countries can promote digital 
financial services by investing in the necessary infrastruc- 
ture, collaborating with private entities to offer training 
for potential users, and ensuring that effective security 
and regulatory measures exist. 


C. Consumer Benefits 


1) Mobile payments are convenient
2) Security 

3) Variable payment modes 

4) Time-efficiency 

5) Pay whenever, wherever 

In the developing world, traditional cash remains the
dominant method for settling online transactions. Moreover,
promoting a credit card culture for online payment in
developing countries has been difficult due to low incomes.
Also, local financial institutions, including banks, are
unwilling to promote or accept credit card services due to
perceived high financial and technical risks. In online
transactions, the payer and the payee are unknown to each other.

As a result, most developing-country citizens are unwilling
to entrust their personal details to websites managed by
anonymous people. Yet the availability and use of e-payment
systems remain one of the key requirements for getting
e-commerce to work in the developing world.


VII. SURVEY AND RESEARCH METHODOLOGY 


The survey of electronic payment systems was conducted as a
part of the Mobidick project to build a new generation of
mobile hand-held computers. Here we deal with the challenges of
integrating many features into a single architecture and focus
on providing a secure electronic payment system. We also come
across the difference between an electronic payment protocol
and an electronic transaction protocol: the electronic payment
protocol deals with the actual money, while the electronic
transaction protocol deals with the whole transaction, including
service delivery, service acceptance, confirmation of payment,
and receipt.

The data used in this study were collected mainly from
secondary sources. Primary data were also collected through
personal interviews with persons expected to have knowledge of
the topic. Secondary data were collected from various sources,
including websites, newspapers, and various published and
unpublished articles about pre-primary education.


1) Survey Instrument Questionnaire:

A questionnaire consists of a number of questions printed
or typed in a definite order on a form or set of forms. It
is sent to the persons concerned with a request to answer
the questions and return it. Respondents are expected to
read and understand the questions and write their replies in
the space provided in the questionnaire itself, answering
on their own. Objective-type questions were designed for
the survey, and responses were collected from students,
professionals, and others. The results of the survey are
shown in graphs. The questionnaire on e-payment systems
used a five-point scale: Agree, Disagree, Strongly disagree,
Strongly agree, and Neutral. The survey questionnaire is
enclosed in the table.

2) Data Analysis:
The data collected were analyzed for the entire sample.

3) Result:
This is descriptive research that studied present
conditions. The relevant data were collected on e-payment
systems and on which e-payment type is most suitable.


VIII. DATA INTERPRETATION


Studies have been carried out on e-payment systems. The
questions relate to e-payment systems, and the given options
are Agree, Disagree, Strongly disagree, Strongly agree, and
Neutral.


TABLE III
OVERALL ANALYSIS OF E-PAYMENT SYSTEM ON THE BASIS OF SURVEY

Strongly disagree

1. E-payment systems saves your time and money | 98 | 12 | 8 | 62 | 20
2. Problem will not arise if our debit card lost or stolen | 65 | 27 | 29 | 70 | 9
3. E-payment system can be easily understood and readily adopted | 64 | 42 | 27 | 39 | 12
4. It is reliable service | 73 | 47 | 49 | 23 | 8
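
The counts in Table III can be converted into totals and percentage shares per statement. The sketch below is illustrative only: the names `ROWS`, `shares`, `totals`, and `percentages` are introduced here, and the five counts of each statement are treated purely by position, without assuming which response option each column stands for.

```python
# Response counts per statement, in the column order of Table III.
ROWS = {
    1: [98, 12, 8, 62, 20],
    2: [65, 27, 29, 70, 9],
    3: [64, 42, 27, 39, 12],
    4: [73, 47, 49, 23, 8],
}


def shares(counts):
    """Percentage share of each count within its row, rounded to one decimal."""
    total = sum(counts)
    return [round(100 * c / total, 1) for c in counts]


totals = {q: sum(counts) for q, counts in ROWS.items()}
percentages = {q: shares(counts) for q, counts in ROWS.items()}
```

With these counts, statements 1, 2, and 4 each total 200 responses, while statement 3 totals 184, so percentages rather than raw counts are the safer basis for comparing statements.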


IX. CONCLUSION 


In conclusion, a wide variety of payment systems is
available to consumers today. However, there is a need for a
single universal payment system that provides the advantages of
all the existing payment systems.

The electronic payment industry has extensive potential for
growth, considering the growth of the Internet. We should take
advantage of this and make the best use of available technology
for the betterment of mankind. Transactions in an e-payment
system may be simple or complex: they range from simple
transactions using magnetic stripe cards, in which customer
details are exchanged for goods or services, to more complex
systems in which an online purchase debits the bank account of
the purchaser and credits the bank account of the seller.

The new institutional theory is applicable to the study of
developing-country environmental effects on e-business
innovation transfers. Developing-country e-business researchers
can draw on the new institutional theory to explain contextual
effects on technology transfers as well as to contribute to
relevant policies. Electronic funds transfers have been around
for many years, and the economy has greatly benefited from this
technological advance. With new innovations and proper usage,
financial technology can be the key to successfully managing
one’s money.

A study in Bangladesh found employees’ satisfaction with 
wage payments being made into an account increased over 
time. Workers reported not wanting to go back to cash 
payments. Digital payments can also be more secure for 
employees than manual cash payments, which can be more 
easily stolen or misappropriated. While security is always a 
concern when traveling with large amounts of cash, this is 
especially salient with respect to regular cash payments—such 
as wages—that are received at publicly known locations and 
points in time. 

There are some challenges, such as cyber attacks, but
innovations in digital payments make them more suitable for
entrepreneurs. Banks that raise proper awareness about PINs and
about what to do if something goes wrong also help to overcome
cyber attacks and other attacks. Successful innovations in
digital payment systems have shown that entrepreneurs and
employees adjust rapidly to their introduction, quickly gaining
competence and comfort when using appropriately designed,
convenient, and efficient systems.


A Survey on Fog Computing

Abstract—Fog computing is a computing architecture in which
a series of nodes receive data from IoT devices in real time.
These nodes process the data they receive in real time, with
millisecond response times. This requires high-speed
connectivity between the IoT devices and the nodes. This survey
covers fog computing architectures based on container
virtualization and SDN functionality (using lightweight Linux
container virtualization provided by the Docker framework),
blockchain functionality, multi-tier architecture, ETSI-NFV
architecture, Pub/Sub architecture, and ERDMS & AMFC. The
applications covered include fog computing for augmented
reality, Fog Guru, and a fog service for disaster response.

Index Terms—Fog computing, Pub-Sub based fog computing, 
ETSI-NFV architecture, multi tier architecture, ERDMS, fog in 
AR, Fog Guru, emergency fog service for disaster response. 


I. INTRODUCTION 


FOG computing is a decentralized computing infrastructure
in which data, compute, storage, and applications are located
somewhere between the data source and the cloud. Like edge
computing, fog computing brings the advantages and power of the
cloud closer to where data is created and acted upon. Many
people use the terms fog computing and edge computing
interchangeably because both involve bringing intelligence and
processing closer to where the data is created. This is often
done to improve efficiency, though it might also be done for
security and compliance reasons.

Video streaming services have stringent requirements, such
as a good-quality communication channel and a steady,
uninterrupted flow of information. Because of that, deploying
such services in fog/cloud environments has attractive
advantages for improving the end users' Quality of Experience
(QoE). Examples of services include video transcoding, multiple
route video transmission, and cache schemes. A transcoder
service can be used to transcode a video with a bit rate of
8 Mbps (1080p) to 5 Mbps (720p), with no visible loss in
quality, if the end-user device is not able to display videos
in 1080p.

Fog computing provides a technical basis for innovative
application scenarios of blockchain technology. The Internet of
Things (IoT) describes a fundamental paradigm shift that
enhances previously analog devices with effective computing
and networking capabilities. Blockchain technology was
introduced with Bitcoin, which supports a simple stack-based
language that allows a limited degree of programmability. It
was introduced to realize a truly decentralized digital
currency for its users, in peer-to-peer (P2P) interaction mode
and without a centralized authority. It deals with
transactions; a transaction has at least one input.

Fig. 1. Fog computing (figure label: End Devices)

Video streaming services are already responsible for the
majority of Internet traffic. A good cloud-level architecture
partially solves some issues related to video streaming
services.

In order to reduce the traffic of networks and servers in
the IoT, several types of fog computing (FC) models have been
proposed, which are composed of fog nodes. A fog node supports
application processes that calculate output data from sensor
data and forward the output data to servers. A topic-based
publish/subscribe (PS) model is a contents-aware, event-driven
model of a distributed system. Here, a process publishes a
message whose contents are denoted by publication topics. A
process specifies subscription topics and is delivered only
those messages whose publication topics include some of the
subscription topics. In previous studies, the MPSFC (Mobile PS
Fog Computing) model was proposed, in which mobile fog nodes
such as vehicles are interconnected and communicate with one
another over wireless networks using the PS model.

Fog computing platforms can provide services with reduced
latency and improved QoS. Fog computing is thus becoming an
important enabler for consumer-centric IoT applications and
services that require real-time operation, e.g. connected
vehicles, smart road intersection management, and smart grid.
M2M data processing with semantics, and the discovery and
management of connected vehicles, are briefly discussed as
consumer-centric IoT services enabled by the distinct
characteristics of fog computing.
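
The delivery rule of the topic-based publish/subscribe model described in this section (a message is delivered to a process when its publication topics include at least one of that process's subscription topics) can be sketched as follows. This is a minimal sketch under stated assumptions: the class `TopicBroker`, its method names, and the node and topic names are hypothetical, and routing through one central broker object is a simplification that does not model fog nodes exchanging messages directly over wireless links.

```python
from collections import defaultdict


class TopicBroker:
    """Toy topic-based publish/subscribe matcher."""

    def __init__(self):
        self._subscriptions = {}        # subscriber name -> set of topics
        self.inbox = defaultdict(list)  # subscriber name -> delivered messages

    def subscribe(self, name, topics):
        self._subscriptions[name] = set(topics)

    def publish(self, message, publication_topics):
        published = set(publication_topics)
        delivered_to = []
        for name, subscribed in self._subscriptions.items():
            # Deliver only if at least one topic is shared.
            if published & subscribed:
                self.inbox[name].append(message)
                delivered_to.append(name)
        return sorted(delivered_to)
```

For example, a node subscribed to `temperature` and `traffic` receives a message published under `temperature`, while a node subscribed only to `humidity` does not.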

Emergency management systems are being developed using
innovative technologies such as the Internet of Things (IoT),
5G, and cloud and fog computing. These new technologies help to
reduce the response time of authorities in crucial situations.
Shorter response times can save many lives during accidents and
other emergency situations. In most accident management
systems, cloud services are used to obtain information and to
notify the emergency management authorities, such as hospitals,
ambulance personnel, and police. Although cloud servers can
hold the detailed information needed in such situations, there
can sometimes be connectivity and information travel time
problems. These issues can be fatal in emergency situations,
where a quick and prompt response is generally required.


II. FOG COMPUTING ARCHITECTURES 


Fog computing is a computing architecture in which a series
of nodes receive data from IoT devices in real time. These
nodes process the data they receive in real time, with
millisecond response times. This requires high-speed
connectivity between the IoT devices and the nodes. Fog
computing is also called fog networking or fogging. The fog
architecture extends services offered by the cloud to edge
devices such as routers, switches, or WAN devices. Fog does not
work in the cloud; it works at the network edge, so it is
faster. Millions of devices are nowadays connected, so hackers
find many entry points to attack and damage the data. The
distributed fog architecture safeguards the connected system
from cloud to device, creating an additional layer of security.


Cloud layer: datacenter HW/SW; IaaS, PaaS, FaaS, SaaS services;
VM allocation
Fog cell layer: fog gateways and controller nodes of fog cell
clusters
smart IoT devices

Fig. 2. Architecture of the fog computing environment


A. A Fog Computing Architecture Based on Container Virtu- 
alization and SDN Functionality 


The versatile fog gateway HCL-BaFog with its data and
control planes is shown in Fig. 2. It uses lightweight Linux
container virtualization provided by the Docker framework and
runs on computer systems based on 32-bit or 64-bit multi-core
ARM processor architectures with HCL (Hypriot Cluster Lab). It
provides the required hardware and software functionality of a
powerful, low-cost computing node and is supported by a
continuous integration approach and corresponding deployment
platforms [1].

Fig. 3. Hierarchical protocol layering of a fog gateway


B. Architecture of a Fog Node with Blockchain Functionality 


A fog computing architecture with integrated blockchain
technology is the combination of the fog gateway HCL-BaFog and
the Multichain framework.

Fig. 4. System architecture of a Multichain node exposing a custom-built
public API to access the internal Multichain daemon and its remote procedure
call interface. (figure label: Sensors)


• Structure of a Multichain Node
Multichain is designed for a permissioned environment. It
works with eight permission capabilities that are granted
to addresses: Connect, Receive, Issue, Activate, Create,
Admin, Mine, and Send. All of these have default values.
Mining is the process of appending new blocks to the
blockchain. The miner solves a block and sends it, along
with its solution, to the network. The receivers validate
the solution and all of the block's transactions. If all
tests pass, the peers append the block and the process
starts again.

e Sharing of Confidential Sensor Data 
It is fundamental property of all blockchains that its re- 
lated data content is public. That is readable by everyone. 
An access protocol to share confidential data in a enhance 
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fog computing environment has been implemented. Here, 
identities and access are used to facilitate emission of 
confidential data to a peer of blockchain. Data is a stream 
used as optional. The identity stream maps a label to a 
public RSA key. Access is used to deposit decryption 
password for the receiver of data item. The content of an 
access item is encrypted by the receiver’s public key. It 
contains tuple where first value is a pointer to the location 
of transmitted data and second value is AES encryption 
password. Blockchain provide good method to provide 
message authentication. API’s offered ‘signmessage’ and 
‘verifymessage’ for sign data and verify signature. 
e Sharing of Real-Time Sensor Data 

The simplest mechanism to store data directly on the 
blockchain is provided by an attachment of the data to 
transactions. Then, the data become part of the immutable 
blockchain history and are stored by all peers of network. 
The storing data directly leads to massive growth in 
size. Multichain offers an integrated solution to indirectly 
store data on the blockchain. It extracts data, split it into 
smaller chunks of 1 MB in size by default. 
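
The chunking step described above can be illustrated with a minimal
sketch. It is not Multichain's implementation: the function names are
hypothetical, and the idea that only a digest of each chunk needs to
be referenced on-chain is an assumption made for illustration. Only
the 1 MB default chunk size comes from the text.

```python
import hashlib

CHUNK_SIZE = 1024 * 1024  # 1 MB default chunk size, as stated in the text


def split_into_chunks(payload: bytes, chunk_size: int = CHUNK_SIZE) -> list[bytes]:
    """Split a payload into consecutive chunks of at most chunk_size bytes."""
    return [payload[i:i + chunk_size] for i in range(0, len(payload), chunk_size)]


def chunk_digests(chunks: list[bytes]) -> list[str]:
    """SHA-256 digest per chunk (assumption: digests stand in for on-chain references)."""
    return [hashlib.sha256(chunk).hexdigest() for chunk in chunks]


# Example: a payload slightly larger than two chunks yields three chunks.
payload = bytes(2 * CHUNK_SIZE + 10)
chunks = split_into_chunks(payload)
digests = chunk_digests(chunks)
```

Joining the chunks in order restores the original payload, so a
receiver that holds all chunks can verify each one against its digest
before reassembling the sensor data.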


C. Multi-tier Architecture 


The top tier is composed of cloud servers, which can be
located in a public or private cloud. The other three layers
are responsible for the fog/edge network. In the multi-tier
ecosystem, the core network regional edge could be handled[3]
by, e.g., a Base Band Unit (BBU) or an Internet Service
Provider (ISP). The access edge network supports a few dozen
local fog nodes and is represented by a base station or access
point. The edge gateways can distribute content to local mist
nodes such as PCs, laptops and smartphones, where a node
delegates video content over wireless connections. End devices
have both high and similar traffic demands and are able to
cooperate with each other.

Fig. 5. Overview of the Hierarchical Fog Cloud Environment


D. ETSI-NFV Architecture 


[ETSI: European Telecommunications Standards Institute,
NFV: Network Function Virtualization] Video streaming is
composed of different services. All these services are
virtualized and deployed as VNFs (Virtual Network Functions).
The NFVO (NFV Orchestrator) is responsible for creating,
monitoring, and chaining network services. The VNFM (VNF
Manager) is responsible for handling VNF instances and for
coordinating requests between the VNF instances and the
management systems of related network modules. The VIM
(Virtual Infrastructure Manager) controls and manages the NFV
infrastructure, which includes computing, storage, and network
resources, and is visible to IaaS (Infrastructure as a
Service). The VIM is the glue between hardware and software
resources.


E. A Pub/Sub Based Fog Computing Architecture for Internet 
of Vehicles 


The pub/sub architecture satisfies the low-latency and
geo-distribution requirements of fog nodes. The interaction
between fog nodes is topic based: a publisher issues a message
with a particular topic, and a subscriber receives the message
if it is interested in that topic. Only the subscribing fog
nodes can accept and receive the information. Each fog node
collects data from cars and STL and detects details such as
direction and accident information. This information is passed
to a layered process[2].

Fig. 6. The proposed architecture in fog computing.


1) Layers 


• First layer - Immediate Processor: This layer captures
event patterns from the gathered information.

• Second layer: This includes the Semantic Labeller, Event
Matcher, Condition Evaluator, Action Scheduler, and
Knowledge Propagator.


2) Domain Ontology
An ontology is a set of concepts and categories in a domain
that shows their properties and relations. The domain ontology
includes the elements Entity Class, Resource Class, Service
Class, Event Class, Event Pattern, and Spatial Temporal
Event Pattern.

3) Active Rules for EVENT-CONDITION-ACTION
When an event occurs, a condition is evaluated, and if the
condition is satisfied, an action is executed.
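
The topic-based interaction and the event-condition-action rules of
this subsection can be sketched together as follows. This is a
minimal illustration under stated assumptions, not the architecture
from [2]: the class names, the "accident" topic, the message fields,
and the severity threshold are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable


@dataclass
class Rule:
    """One event-condition-action rule (hypothetical structure)."""
    condition: Callable[[dict], bool]
    action: Callable[[dict], str]


class FogNode:
    """A fog node that evaluates its rules for every message it receives."""

    def __init__(self, name: str, rules: list[Rule]) -> None:
        self.name = name
        self.rules = rules
        self.actions: list[str] = []

    def on_message(self, message: dict) -> None:
        # Event: a message arrived. Condition: rule.condition. Action: rule.action.
        for rule in self.rules:
            if rule.condition(message):
                self.actions.append(rule.action(message))


class Broker:
    """Topic-based broker: only nodes subscribed to a topic receive its messages."""

    def __init__(self) -> None:
        self._subscribers: dict[str, list[FogNode]] = defaultdict(list)

    def subscribe(self, topic: str, node: FogNode) -> None:
        self._subscribers[topic].append(node)

    def publish(self, topic: str, message: dict) -> int:
        """Deliver the message to all subscribers and return how many received it."""
        nodes = self._subscribers.get(topic, [])
        for node in nodes:
            node.on_message(message)
        return len(nodes)


# Hypothetical rule: alert only when the reported severity is 3 or higher.
severe = Rule(
    condition=lambda m: m["severity"] >= 3,
    action=lambda m: f"alert ambulance near {m['location']}",
)
node_a = FogNode("fog-a", [severe])
node_b = FogNode("fog-b", [severe])

broker = Broker()
broker.subscribe("accident", node_a)  # node_b does not subscribe

delivered = broker.publish("accident", {"severity": 4, "location": "junction-7"})
minor = broker.publish("accident", {"severity": 1, "location": "junction-2"})
```

Here both messages reach the subscribed node, but only the first one
satisfies the condition and triggers an action; the node that did not
subscribe receives nothing.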


F. An Architecture for Fog Computing Enabled Emergency
Response Disaster Management System (ERDMS) & Accident
Management System


Cloud-based technologies have been widely used to make
decisions in emergency situations. Although the cloud has many
advantages, its main problems are latency and the lack of
location awareness. Fog computing, or edge computing, plays a
vital role in solving these problems. The main objective is to
minimize the emergency response time[4][5]. Fog computing
allows data processing to be performed near the source, which
results in quick responses and reduced latency.


1) Layers
The interface layer provides interfaces for all car users,
the hospital, and the ambulance. The database layer holds the
family network information of the vehicle owner[4][5]. The
service layer uses the family network information and
notifies the family network through an SMS API. Once an
accident is detected, the Google Maps API is used to send the
accident site and route to the ambulance driver. The smart
device layer is associated with actuators or sensors with
wireless communication capability. The fog node layer
supports a multitude of access technologies, acts as a
gateway between a sensor network and the local/remote cloud
facilities, processes the reported data, and makes decisions
before directing traffic to the cloud layer. The cloud layer
is a repository or platform that generally handles data
analytics and data warehousing by providing abundant
resources, together with a traffic management unit (TMU)
that has global positioning capability.

Fig. 7. ERDMS System Architecture

2) System Flow
Data processing is performed at the fog nodes to reduce
latency. The fog node plays a key role in accident detection
and data processing. It gathers the relevant data using
Android sensors[5]. Data are continuously received,
monitored, and processed on the fog node.
3) System View

Fig. 8. ERDMS System Flow


The system view provides a specification of the system
flow. The complete system flow is presented in Fig. 8.
The proposed system is divided into two parts:


• Accident detection phase: Three smartphone sensors are
used: an accelerometer, a GPS sensor, and a microphone
for noise detection. The accelerometer value alone is
not enough to detect an accident[5].

• Emergency response and notification phase: The main
objective is a fast response time along with the right
type of information. The Android application uses
the information provided to everyone.
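
The point that the accelerometer value alone is not enough can be
illustrated with a minimal sketch that requires all three sensors to
agree. The thresholds, field names, and the rule of combining the
readings with a logical AND are assumptions chosen for illustration;
they are not taken from [5].

```python
from dataclasses import dataclass

# Hypothetical thresholds, chosen only for illustration (not from the source).
G_FORCE_THRESHOLD = 4.0     # accelerometer magnitude, in g
NOISE_THRESHOLD_DB = 140.0  # microphone level, in dB
MIN_SPEED_KMH = 24.0        # GPS speed shortly before the event


@dataclass
class SensorReading:
    g_force: float
    noise_db: float
    speed_kmh: float


def accident_detected(reading: SensorReading) -> bool:
    """Report an accident only when all three sensors exceed their thresholds."""
    return (
        reading.g_force >= G_FORCE_THRESHOLD
        and reading.noise_db >= NOISE_THRESHOLD_DB
        and reading.speed_kmh >= MIN_SPEED_KMH
    )


# A dropped phone: high acceleration, but no loud noise and no vehicle speed.
dropped_phone = SensorReading(g_force=5.0, noise_db=60.0, speed_kmh=0.0)
# A crash: all three sensors exceed their thresholds.
crash = SensorReading(g_force=6.0, noise_db=145.0, speed_kmh=50.0)

false_alarm = accident_detected(dropped_phone)
real_alarm = accident_detected(crash)
```

With only the accelerometer, the dropped phone would raise a false
alarm; combining it with the GPS and microphone readings filters
that case out.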


III. FOG COMPUTING APPLICATIONS 


A. Fog Computing for Augmented Reality 


Augmented reality applications are computationally inten- 
sive and have latency requirements in the range of 15-20 mil- 
liseconds[7]. Fog computing addresses these requirements by 
providing on-demand computing capacity and lower latency by 
bringing the computational resources closer to the augmented 
reality devices. Augmented reality (AR) systems enhance the 
view of the real world by overlaying context-specific infor- 
mation on top of the real world. The advantages of using AR 
systems in industrial contexts have been investigated, such as 
in shipyards, heavy machinery, remote assisted maintenance, 
and industrial human-robot collaboration. 

Fog-based applications are designed considering the
hierarchical and layered fog computing architecture, as shown
in Fig. 9. Here, the latency-sensitive components of the
application are executed closer to the devices, while the
layers closer to, and including, the cloud are reserved mostly
for data storage, monitoring, and coordination of the layers
lower in the hierarchy. One line of work provided a
component-based middleware for a cloudlet architecture and
investigated the usefulness of such an architecture for AR
applications. It considered resources such as laptops and
mobile devices within the local network to be a cloudlet,
along with the computing resources in the cloud.

The architecture was extended by adding support for
synchronization and shared off-loading. Synchronization
maintains a consistent application state across the
collaborating devices, while with shared off-loading, certain
components that are common to all devices are off-loaded to a
cloudlet. To support a collaborative, multi-user AR
application scenario, the cloudlets communicate with other
cloudlets. Here, end devices are connected to a base station
that is equipped with a cloudlet.

Fig. 9. Layered Architecture.


A three-layered architecture for AR applications has been
customized for a shipyard. Here, end devices are part of a
local network connected via a wireless access point, and they
communicate with a so-called “local edge layer gateway”,
which provides computing capacity to AR devices within the
local network. Additionally, these edge layer gateways form
a network among themselves to enable collaboration between
remote AR devices. A cloudlet is also added to these gateway
devices in the edge layer; it is assumed to provide more
computational resources than the local gateways.


B. Energy Optimization 


For wearable AR devices, fog computing provides the
opportunity to off-load parts of their computationally
demanding tasks, thus reducing their energy consumption. A
solution has been proposed for energy optimization when
multiple AR devices interact with a shared application. The
idea is that, by utilizing mobile edge computing, each device
involved in a collaborative AR application does not need to
send all data by itself. Instead, each device only sends a
fraction of the relevant data to the cloudlet, and the
cloudlet returns only the fraction of data that is relevant
to each specific device.


C. Latency Optimization 


The recommended frame rate for AR applications on
head-mounted devices is 60 fps, which means that the latency
must be less than 17 milliseconds. Another recommendation is
to have a consistent frame rate; for example, 30 fps over the
complete duration is much better than varying frame rates
within short intervals.
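
The relation between frame rate and latency budget used above is a
simple reciprocal. The following short sketch (the function name is
chosen here for illustration) shows where the figure of roughly
17 milliseconds for 60 fps comes from.

```python
def frame_budget_ms(fps: float) -> float:
    """Time available per frame, in milliseconds, at a given frame rate."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


budget_60 = frame_budget_ms(60)  # about 16.7 ms per frame
budget_30 = frame_budget_ms(30)  # about 33.3 ms per frame
```

Any processing and transmission that must complete within one frame
at 60 fps therefore has to fit into about 16.7 ms.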


• Latency as a Constraint
Off-loading introduces a transmission delay between the
client device and the edge cloud compared to local
execution, but it was shown that the time saved by
off-loading the computation to the edge is still higher than
the transmission delay. Moreover, compressing data requires
additional time for encoding and decoding, but the results
showed that transmitting the compressed data still produced
lower latency than transmitting uncompressed data. All in
all, for sending, tracking, annotating, and receiving a
752x480 resolution video compressed according to JPEG, the
end-to-end latency was about 50 milliseconds. It was
concluded that this latency is acceptable for a seamless AR
experience using hand-held devices, but too high for
head-mounted devices.

• Minimization Function
One investigation considered collocated encoding and
rendering as well as split encoding and rendering of the
augmented video for transmission and display. To address
both the task assignment and the off-loading problem, a
multi-objective problem was formulated using the weighted
sum method, which aims to minimize latency and maximize
video quality. The non-linear integer programming problem
was solved using an algorithm based on the block coordinate
descent method.

• Combined Techniques
A combination of techniques has been used to minimize as
well as to hide latency in order to achieve 60 fps. The
techniques involve a “dynamic region of interest (RoI)
encoding” scheme that reduces the bandwidth consumption by
decreasing the encoding quality of regions in the frame
that are unlikely to contain useful information. This is
coupled with a “parallel streaming and inference”
technique, where inference begins on “slices of a frame”.
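
The weighted sum method mentioned under "Minimization Function" can
be sketched as follows. This is only an illustration of the
scalarization idea: the candidate configurations, their normalized
latency and quality values, and the equal weights are hypothetical,
and the sketch uses exhaustive search over three candidates instead
of the block coordinate descent algorithm used in the cited work.

```python
def weighted_sum_cost(latency: float, quality: float,
                      w_latency: float = 0.5, w_quality: float = 0.5) -> float:
    """Scalarize two objectives: lower latency is better, higher quality is better.

    Both inputs are assumed to be normalized to the range [0, 1].
    """
    return w_latency * latency - w_quality * quality


# Hypothetical candidates: name -> (normalized latency, normalized quality).
candidates = {
    "local_high_quality": (0.9, 1.0),
    "offload_medium_quality": (0.4, 0.8),
    "offload_low_quality": (0.3, 0.4),
}

# Exhaustive search over the candidates (a stand-in for a real solver).
best = min(candidates, key=lambda name: weighted_sum_cost(*candidates[name]))
```

Changing the weights shifts the trade-off: a larger latency weight
favours the fastest configuration, a larger quality weight favours
the highest-quality one.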


D. Applications 


Due to the limited computing capacity, AR devices cannot 
be used for reinforcement learning. Instead, fog computing is 
used for training the algorithm and collecting environmental 
information that might be necessary for AR systems. There- 
fore, the AR system simply retrieves information computed by 
fog nodes. The reinforcement learning is used to prevent AR 
applications showing things that might be harmful for the user 
by using adaptive policies, for example, virtual objects should 
not be shown in a way that occludes important objects. 

One investigation studied the use of fog nodes as a platform
for training an imitation learning algorithm, which is another
machine learning technique. Imitation learning was used to
support personalization in AR applications by learning the
user’s preferences on where overlaid objects should be
displayed in the environment. The imitation learning setup
consists of two parts: (i) a teacher agent, which is
controlled by the user, and (ii) a student agent, which
automatically captures the data from the teacher agent. After
several training sessions conducted in a virtual-reality
environment, the agent is able to learn the user’s preferred
position of overlaid objects and the physical trajectory
along which it was taken. In addition, both accuracy and
precision improve as the number of training sessions
increases.


IV. FOG GURU: A FOG COMPUTING PLATFORM BASED ON 
APACHE FLINK 


Fog Computing could enable highly adaptive deployments 
of services, including support for programming at the software 
and hardware layers. This work aimed to design and develop a 
Fog platform, called FogGuru, for facilitating the development 
and deployment of Fog applications[9]. The prototype built 
makes use of the real-time Stream Processing Engine (SPE) 
Apache Flink, which is provided as an image ready to be 
deployed on resource-constrained devices. 


A. Hardware Stack 


In principle, any device with computing, storage, and
network connectivity could act as a fog node [3]. Additionally,
fog nodes should be geographically distributable, able to
cope with different network types, and cheap and easy to
replicate. Here, we chose to work with Raspberry Pi 3B+
single-board computers as standard devices.


B. Platform Architecture 


The high-level architecture of the FogGuru platform is shown
in Fig. 10. It is consistent with standard architectures in
edge/fog computing and in data processing pipelines. Data
from the IoT tier is ingested through a suitable queuing
system, from which it is fetched to be processed.

Fig. 10. The FogGuru Platform architecture


• Messaging System
Message Queue Telemetry Transport (MQTT) is a
lightweight publish-subscribe network protocol widely
used in IoT applications, and it is better suited for fog
applications than more powerful cloud frameworks.

• Processing
Stream Processing Engines (SPEs) represent a good
paradigm for fog computing applications because of their
extensive set of features, including support for
event-driven, data pipeline, and data analytics
applications.

• Monitoring
We used Prometheus and Grafana for computing real-time
performance indicators based on the Apache Flink metrics
APIs.

• Containerization
Container orchestration is mandatory for deploying
fog applications at scale.


V. A FAST DEPLOYMENT STRATEGY OF EMERGENCY FOG 
SERVICE FOR DISASTER RESPONSE 


As shown in Fig. 11, within a square open area, blue
triangles are base stations that still work, while red ones
are damaged (or without power) and cannot be used anymore.
Blue circles mark the signal ranges, which together cover the
green crosses, the user equipment (UE) that can connect to
these base stations. The first task is to help the red
crosses, the disconnected user equipment, to reconnect to the
circles through fog nodes. Suppose all users stay still, or
only move within a fixed range, and wait for rescue. First,
help is sought from nearby end devices that can serve as fog
nodes; that is, the nearby fog nodes are used to build an
early-stage emergency fog network. After this step, some of
the lost user devices may be connected to base stations again,
while others may form clusters and some users remain alone.
In the next step, the task is therefore to cover all the
clusters and users by deploying new fog nodes (FN) in this
area. The inputs are the number of UEs in a disconnected
state, the fog nodes used to realize emergency networking,
and a queue that records the forwarding from UE to base
station (BS). The first step is to start a polling loop in
which each UE out of coverage chooses a neighbour to form a
pair, and an FN is then placed at the midpoint of this pair
of users. The other UEs are then traversed to check whether
any of them can connect to this FN. After the placement of
the new FN, the next step is to test the routing from the UE
to any of the BSs through the placed FNs[8]. The output of
this algorithm is the number of UEs reconnected and the total
latency of the emergency routing.
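
The midpoint placement step described above can be sketched as
follows. This is a simplified illustration, not the algorithm from
[8]: it assumes 2-D coordinates, a single hypothetical FN radio range,
a nearest-neighbour choice for the pair, and a fallback that places
the FN at the UE itself when no neighbour is within reach. Routing to
a base station and latency are not modelled.

```python
import math

Point = tuple[float, float]


def dist(a: Point, b: Point) -> float:
    """Euclidean distance between two points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def place_fog_nodes(disconnected: list[Point],
                    fn_range: float) -> list[tuple[Point, list[Point]]]:
    """Greedy midpoint placement; returns (FN position, UEs served) pairs."""
    uncovered = list(disconnected)
    placements = []
    while uncovered:
        ue = uncovered.pop(0)
        fn = ue  # fallback (assumption): serve an isolated UE directly
        if uncovered:
            neighbour = min(uncovered, key=lambda p: dist(ue, p))
            midpoint = ((ue[0] + neighbour[0]) / 2, (ue[1] + neighbour[1]) / 2)
            if dist(midpoint, ue) <= fn_range:
                fn = midpoint
        # Traverse the remaining UEs to see which can connect to this FN.
        served = [ue] + [p for p in uncovered if dist(fn, p) <= fn_range]
        uncovered = [p for p in uncovered if dist(fn, p) > fn_range]
        placements.append((fn, served))
    return placements


ues = [(0.0, 0.0), (2.0, 0.0), (1.0, 1.0), (10.0, 10.0)]
placements = place_fog_nodes(ues, fn_range=1.5)
```

In this example three FNs are placed: one at the midpoint of the
first pair, one for the UE that stays just out of range, and one for
the isolated UE.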


VI. CONCLUSION 


This survey addresses the concept of fog computing, its
architectures and applications. Fog computing is considered a
good partner of cloud computing, extending the services of
the cloud to the end user. Because of its characteristics,
such as mobility, proximity to the end user, location
awareness, heterogeneity, and support for real-time
applications, the fog computing paradigm is a more suitable
platform for the Internet of Things. Its major functions
include lowering latency, improving security, and creating a
smart world in terms of networking.

Many studies have been carried out on fog computing. The
architectures discussed are the multi-tier architecture, the
pub/sub model, the Emergency Response Disaster Management
System (ERDMS), the Accident Management System Using Fog
(AMFC), the ETSI-NFV architecture [ETSI: European
Telecommunications Standards Institute, NFV: Network Function
Virtualization], the architecture based on container
virtualization and SDN functionality, and the architecture of
a fog node with blockchain functionality. AMFC and ERDMS are
very useful for detecting accidents and natural disasters and
for responding quickly. Using the pub/sub architecture, the
fog nodes can work together and respond to the same event
that they are subscribed to; an event can be propagated to
all fog nodes, but only the subscribing nodes accept and
receive the information.

Fig. 11. An example of affected area after disaster

The applications of fog computing discussed include fog
computing for augmented reality (AR) and a fog computing
platform based on Apache Flink. Fog computing is used for
training the algorithm and collecting environmental
information that might be necessary for AR systems, so the AR
system simply retrieves information computed by fog nodes.
The fog computing platform based on Apache Flink is
consistent with standard architectures in edge/fog computing
and in data processing pipelines.

Fog computing also has drawbacks. One is complexity: many
devices at different locations store and analyze their own
sets of data, which can add complexity to the network. Fog
nodes may also be located in less secure environments, where
hackers can spoof IP addresses to gain access to the
respective fog node. Fog nodes require a large amount of
energy to function, so the more fog nodes there are in a fog
infrastructure, the higher the power consumption. Further
research is required in this field to reduce drawbacks such
as security and power consumption issues.


Abstract—In recent years, the use of online social networks
(OSNs) such as Facebook and Twitter has increased
tremendously. These OSNs offer attractive means of online
social interaction and communication, but they also raise
privacy and security concerns. In this paper we discuss the
current access control mechanisms employed by OSNs to protect
private information shared among their users. The proposed
approach presents a system of collaborative content
management that relies on an extended notion of a “content
stakeholder.” A tool, Collaborative Privacy Management
(CoPE), is implemented as an application within a popular
social networking site, facebook.com, to ensure the
protection of shared images generated by users. We extend the
research to present a thorough review of the different
security and privacy risks that threaten the well-being of
OSN users. In addition, we present an overview of existing
solutions that can provide better protection, security, and
privacy for OSN users.

Index Terms—Online social networks (OSNs), collaborative 
privacy management (CoPE), design of CoPE, access control 
system and policy composition, system architecture of CoPE, 
resolving privacy conflict, conflict resolution. 


I. INTRODUCTION 


Online social networks (OSNs), such as Facebook, Twitter,
and Google+, have become a de facto portal for hundreds of
millions of Internet users. For example, Facebook, one of the
representative social network providers, claims that it has
more than 800 million active users. With the help of these
OSNs, people share personal and public information and make
social connections with friends, coworkers, colleagues,
family and even strangers. As a result, OSNs store a huge
amount of possibly sensitive and private information on users
and their interactions. Regardless of the purpose of an OSN,
one of the main motivations for users to join an OSN, create
a profile, and use the different applications offered by the
OSN is the possibility to easily share information with
selected contacts or the public, and to facilitate social
interactions between the users of OSNs. On the other hand,
leakage of personal information, especially one’s identity,
may invite malicious attacks from the real world and
cyberspace, such as stalking, reputation slander,
personalized spamming, and phishing. Despite the risks, many
of the privacy and access control mechanisms of today’s OSNs
are purposefully weak to make joining the OSN and sharing
information easy. We believe that more effective and flexible
security mechanisms are therefore required for the safety of
OSN users as well as the continued thriving of OSNs.

Fig. 1. Privacy in social media


II. ONLINE SOCIAL NETWORK USAGE 


Today many OSNs have tens of millions of registered users. 
Facebook, with more than a billion active users, is currently 
the largest and most popular OSN in the world. Other well- 
known OSNs are Google+, with over 235 million active users; 
Twitter, with over 200 million active users; and LinkedIn, with 
more than 160 million active users. 

72% of online American adults use social networking
sites, a dramatic increase from the 2005 Pew survey, which
found that just 8% of online American adults used social
networking sites. Moreover, the survey revealed that 89% of
online American adults between the ages of 18 and 29 use
social networking sites, while in 2005 only 9% of the survey
participants in this age group used this type of site. These
survey results are consistent with a previous report
published by Nielsen in 2011, which disclosed that Americans
spent 22.5% of their online time on OSNs and blogs, more than
twice the time spent on online games (9.8%). Other common
activities that consume Americans’ online time include email
(7.6%), portals (4.5%), videos and movies (4.4%), searches
(4.0%), and instant messaging (3.3%). U.S. users spent a
total of 53.5 billion minutes on Facebook during May 2011,
17.2 billion minutes on Yahoo, and 12.5 billion minutes on
Google. Besides being popular among adults, OSNs have become
extremely popular with young children and teenagers. 60% of
children 9 to 16 years old who access the Internet use it
daily (88 minutes of use on average), and 59% of those 9 to
16 years old who use the Internet have a personal OSN site
profile (26% of ages 9 to 10; 49% of ages 11 to 12; 73% of
ages 13 to 14; 82% of ages 15 to 16).

The use of OSNs is embedded in the everyday lives of 
young children and teenagers, and can result in personal 
information being exposed, misused, and potentially abused. 


III. COPE: ENABLING COLLABORATIVE PRIVACY 
MANAGEMENT IN ONLINE SOCIAL NETWORKS 


Collaborative Privacy Management (CoPE) is implemented
as an application within a popular social networking site,
facebook.com, to ensure the protection of shared images
generated by users.

We present a user study of our CoPE tool through a survey-
based study (n = 80). The results demonstrate that regardless
of whether Facebook users are worried about their privacy, 
they like the idea of collaborative privacy management and 
believe that a tool such as CoPE would be useful to manage 
their personal information shared within a social network. For 
example, most social networking services (e.g., facebook.com) 
allow users to create content that may connect with their 
friends’ identities (e.g., uploading an image about a friend, 
tagging a friend in an image, or linking to a friend’s personal 
profile). Such collaborative activities raise a new set of privacy 
challenges because a person’s private information can be easily 
revealed in content created by others. In other words, in the 
context of OSNs, private information will not only reside in 
a single user’s own domain but also be co-owned and co- 
managed by other stakeholders (e.g., friends who upload or 
comment on an image). Thus, the task of privacy management 
has to involve other stakeholders in a collaborative fashion. 
Here, we develop a system named Collaborative Privacy 
Management (CoPE) to support users’ collaborative privacy 
management. To assess the validity of our proposed access 
control model, we implement and evaluate a CoPE system 
within Facebook. 


A. Design of CoPE Access Control System and Policy Com- 
position 


We aim to design the CoPE system as an integrated solution
that provides users with privacy mechanisms to collaboratively
protect and manage the accessibility of their published images
in OSNs. The advantages of CoPE are twofold.

First, it allows users to prevent unauthorized access to their
personal data by providing a high level of control over other
users’ access rights. Second, all stakeholders are given the
ability to jointly manage shared images and mutually benefit
from the control features offered by the tool. While we design
and test our model focusing on images, our approach can be
generalized to deal with other content and to identify its
stakeholders. Further, the general model proposed in this
article can be applied to a number of different systems,
characterized by different content-sharing features.


B. System Architecture of CoPE 


Our CoPE tool was implemented under a client-server
architecture using an Apache Tomcat application server with
PHP and a MySQL 5.0.22 database server. The Tomcat
application server is responsible for the data processing and
management of shared content, user profiles, and shared
profiles. CoPE is implemented as a Facebook application.
Like every Facebook application, CoPE has its own unique
appkey and secret keys that are used to enable access to
the Facebook platform. The application is made to run in
an iframe, and support for Facebook Markup Language is
enabled. The application settings are customized such that all
users can add and use the application.

The application includes several PHP files, which pro- 
cess user authentication, user interface layout, shared content 
management, co-ownership management, friend management, 
privacy policy management, and so on. User authentication is 
integrated with the authentication of Facebook. Once the user 
installs the application in his or her Facebook profile page, the 
application file “PrivateBoxAlbum” accesses the user’s profile 
data, and uses such information to complete the application 
template. In particular, the application imports the user image 
files and renders them from the CoPE. Image files are locally 
stored in and managed by the MySQL database server. 

Upon the user opening CoPE, photos added by the current
user are retrieved from the database. The list of users who
have been tagged in the images is retrieved using
Facebook-specific methods [e.g., photo.get.Tags()]. The
system is then in charge of removing possible duplicates
(i.e., some users may share multiple files) and creating a
unique list of stakeholders for each profile. Once the list is
identified, the notification process is started by leveraging
the notification systems provided by Facebook. When
stakeholders enter their privacy preferences (under the
settings.php file), the system tracks them and computes the
privacy settings resulting from the users’ input. Once a
common policy is formed, the visibility of the image changes,
and the corresponding access policy is enforced. In detail,
this is achieved by carrying out the following tasks:

• First, the values of the settings of the current user are
retrieved (from settings.php).

• Second, the photo IDs of the images that the current user
is sharing are saved.

• Third, for all such images, the corresponding privacy-
setting changes are collected.

• Fourth, once all these settings are collected, the system
composes the policy, including and excluding viewers
according to the criteria indicated by users.

This policy is added to the settings database for each image
so that each time an image is invoked by the application
file in charge of rendering the image, the correct settings
are applied. If a user fails to provide a preference, the
default preference is applied until further modification.
Note that although our application was implemented with the
APIs provided by Facebook, it can be easily migrated to other
OSN platforms. The APIs our implementation relies on to access
the tags of a picture and users are commonly available in
different OSN platforms.
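
The policy composition step can be illustrated with a minimal sketch.
The source does not specify the composition rule, so the rule used
here is an assumption: a viewer may see an image only if every
stakeholder's preference allows it, and a stakeholder who has not
entered a preference is treated as having the default preference. All
names are hypothetical and unrelated to the actual CoPE code.

```python
from typing import Optional


def compose_policy(preferences: dict[str, Optional[set[str]]],
                   default: set[str]) -> set[str]:
    """Return the viewers allowed by every stakeholder (intersection rule).

    preferences maps each stakeholder to the set of viewers he or she allows,
    or to None when no preference was provided (the default then applies).
    """
    allowed: Optional[set[str]] = None
    for viewers in preferences.values():
        effective = default if viewers is None else viewers
        allowed = set(effective) if allowed is None else allowed & effective
    return allowed if allowed is not None else set()


default_viewers = {"bob", "carol", "dave", "erin"}
preferences = {
    "alice": {"bob", "carol", "dave"},
    "bob": {"carol", "dave", "erin"},
    "carol": None,  # no preference entered: the default preference applies
}
policy = compose_policy(preferences, default_viewers)
```

Under this assumed rule the image becomes visible only to the viewers
that all three stakeholders accept.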


IV. RESOLVING PRIVACY CONFLICT 


While OSNs allow a single user to manage access to her/his
data, they currently do not provide any mechanism to enforce
privacy concerns over data associated with multiple users,
leaving privacy violations largely unresolved and leading to
the potential disclosure of information that at least one
user intended to keep private. This is because the public
image of a subject can be affected by photos or comments
posted on a social network. Recent studies show that users
are demanding better mechanisms to protect their privacy. To
address this concern, we provide a systematic mechanism to
identify and resolve privacy conflicts in online social
networks (OSNs): the first computational tool to resolve
conflicts for multiparty privacy management in social media.
It is able to adapt to different situations by modelling the
concessions that users make to reach a solution to the
conflicts. The conflict resolution specifies a tradeoff
between privacy protection and data sharing by computing
privacy risk and sharing loss.


A. Conflict Resolution 


When two users disagree on whom a shared data item should
be visible to, we say that a privacy conflict occurs. A
mediator detects such conflicts and suggests a possible
solution to them. In most social media infrastructures, such
as Facebook, Twitter, Google+ and the like, this mediator
could be integrated as the back end of the social media (SM)
privacy controls' interface, or it could be implemented as a
social media application, such as a Facebook application, that
works as an interface to the privacy controls of the
underlying social media infrastructure. The mediator inspects
the individual privacy policies of all users for the item and
flags all the conflicts found. Basically, it checks whether
individual privacy policies suggest conflicting access control
decisions for the same target user. If conflicts are found,
the item is, as a precaution, not shared. The mediator then
proposes a solution for each conflict found. To do so, it
estimates how willing each negotiating user may be to concede
by considering his/her individual privacy preferences, how
sensitive the particular item is for him/her, and the relative
importance of the conflicting target users for him/her.
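The conflict check described above can be illustrated with a
short sketch. It assumes, purely for the example, that each
negotiating user's policy is a mapping from target user to one
of two actions ("grant" or "deny"); the names find_conflicts
and policies are hypothetical and do not come from the paper.

```python
from typing import Dict, Set

GRANT, DENY = "grant", "deny"


def find_conflicts(policies: Dict[str, Dict[str, str]]) -> Set[str]:
    """Return the target users on whom the negotiating users disagree.

    `policies` maps each negotiating user to {target_user: action}.
    A target is conflicting when more than one distinct action is
    requested for it.
    """
    actions_per_target: Dict[str, Set[str]] = {}
    for user_policy in policies.values():
        for target, action in user_policy.items():
            actions_per_target.setdefault(target, set()).add(action)
    return {t for t, actions in actions_per_target.items() if len(actions) > 1}


policies = {
    "alice": {"carol": GRANT, "dave": DENY},
    "bob": {"carol": GRANT, "dave": GRANT},
}
conflicts = find_conflicts(policies)  # only "dave" is disputed
```

An item with a non-empty conflict set would be withheld until
the mediator has proposed an action for every conflicting
target user.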


B. Computing Conflict Resolution 


Negotiations about privacy in social media are collaborative
most of the time. That is, users consider others' preferences
when deciding with whom to share, so they may be willing to
concede and change their initially preferred option. Being
able to model the situations in which these concessions occur
is of crucial importance for proposing the best solution to
the conflicts found: one that would be acceptable to all the
users involved.

The following rules model the privacy preference of each
user and are used for conflict resolution:


• I do not mind (IDM) rule:
Under this rule, when a user wants to upload his/her own
item to the network, the other users have no objection.

• I understand (IU) rule:
Under this rule, when one user wants to share a photo and
another does not want it shared, the users do not want to
cause any harm to their friends and will normally listen
to their objections.

• No concession (NC) rule:
In the remaining cases, in which neither IDM nor IU
applies, the mediator estimates that a negotiating user
would not concede and would prefer to stick to his/her
preferred action for the conflicting target user.


The mediator computes the solution for each conflict found
by applying the concession rules defined above. If a target
user t is not conflicting, the mediator assigns to t the
action shared by all negotiating users. If t is conflicting,
the mediator assigns to t the action it proposes to solve the
conflict.

In particular, for each conflicting target user t: if, for
all negotiating users, the willingness to change their
preferred action for t is high, then, according to concession
rule IDM, the mediator assumes that all users are willing to
concede if need be, so that the final action applied to t can
be either granting or denying. To select one of these two
actions, the mediator runs a modified majority voting rule.
Precisely, this function selects the action favored by the
majority of users. In case of a tie, i.e., when the number of
users who wish to grant and the number who wish to deny are
the same, the uploader is given an extra vote.
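A minimal sketch of the modified majority voting rule is given
below. The function name and the "grant"/"deny" encoding are
assumptions made for illustration; giving the uploader an extra
vote in a tie is implemented by returning the uploader's own
preference.

```python
from typing import Dict

GRANT, DENY = "grant", "deny"


def modified_majority(preferences: Dict[str, str], uploader: str) -> str:
    """Pick the action for one conflicting target user.

    `preferences` maps each negotiating user to the action he/she prefers.
    The majority wins; in a tie the uploader's extra vote decides.
    """
    if uploader not in preferences:
        raise KeyError("the uploader must be one of the negotiating users")
    grant = sum(1 for action in preferences.values() if action == GRANT)
    deny = sum(1 for action in preferences.values() if action == DENY)
    if grant + deny != len(preferences):
        raise ValueError("every preference must be 'grant' or 'deny'")
    if grant == deny:
        # Tie: the uploader's additional vote tips the balance.
        return preferences[uploader]
    return GRANT if grant > deny else DENY


tie = modified_majority({"alice": GRANT, "bob": DENY}, uploader="alice")
clear = modified_majority({"alice": GRANT, "bob": DENY, "eve": DENY},
                          uploader="alice")
```

In the first call the vote is tied and the uploader (alice)
decides; in the second the majority prevails regardless of the
uploader's preference.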

However, the specific concession rule instantiated in each
situation depends on the individual privacy policy of each
user, the sensitivity of the item for the user, and the relative
importance of the conflicting target user, all of which may
vary from participant to participant.

• Definition of the Individual Privacy Policy:
Each participant was asked to define her/his most
preferred privacy policy for each photo.

• Conflict and Concession Question:
Once the participants had defined their individual privacy
policy for the photo, a conflict was generated. That is,
we told the participants that one or more of the other
people in the photo had a different most preferred action
for one particular person, specifying the relationship type
and strength the participant would have to this person. For
instance, if the participant only wanted to share the photo
with close friends, we told her/him that the other people
in the photo wanted to share the photo with someone who
was her/his friend. Where multiple options were available
to generate a conflict, we chose one of them randomly.
Then, we asked participants whether or not they would
concede and change their most preferred action for that
person to solve the conflict with the other people depicted
in the photo.


Fig. 2. Threats to online social network users


V. THREATS IN ONLINE SOCIAL NETWORKS 


With the increasing usage of OSNs, many users have
unknowingly become exposed to threats both to their privacy
and to their security. These threats can be divided into four
categories: classic threats, modern threats, combination
threats, and threats targeting children.


A. Classic Threats 


Classic threats have been a problem ever since the Internet 
gained widespread usage. Different types of classic threats are 
listed below: 


• Malware:
Malware is malicious software developed to disrupt a
computer operation in order to collect a user's credentials
and gain access to his or her private information.

• Phishing Attacks:
Phishing attacks are a form of social engineering to
acquire user-sensitive and private information by
impersonating a trustworthy third party.

• Spammers:
Spammers are users who use electronic messaging systems
in order to send unwanted messages, like advertisements,
to other users.

• Cross-Site Scripting (XSS):
An XSS attack is an attack against web applications.
The attacker exploits the trust of the web client in the
web application and causes the web client to run malicious
code capable of collecting sensitive information.

• Internet Fraud:
Internet fraud, also known as cyber fraud, refers to using
Internet access to scam or take advantage of people.


B. Modern Threats


Modern threats are typically unique to OSN environments.
Usually these threats specifically target users' personal
information as well as the personal information of their
friends. The types of modern threats are listed below.

• Clickjacking:
Clickjacking is a malicious technique which tricks users
into clicking on something different from what they
intended to click.

• De-Anonymization Attacks:
De-anonymization attacks use techniques such as tracking
cookies, network topology, and user group memberships
to uncover the user's real identity.

• Face Recognition:
Many people use OSNs for uploading pictures of themselves
and their friends. Facebook user profile pictures are
publicly available to view and download. These photos
can be used to create a biometric database, which can
then be used to identify OSN users without their consent.

• Fake Profiles:
By initiating friend requests to other users in the OSN,
who often accept the requests, fake profiles (socialbots)
can gather a user's private data which should be exposed
only to the user's friends.

• Identity Clone Attacks:
Attackers duplicate a user's online presence, either in
the same network or in a different one.

• Inference Attacks:
Inference attacks are used to predict a user's personal,
sensitive information that the user has not chosen to
disclose, such as religious affiliation or sexual
orientation.

• Information Leakage:
In some cases OSN users willingly share sensitive
information about themselves and other people, such as
health-related information.

• Location Leakage:
Many people use OSNs to willingly share private and
sometimes sensitive information about their (or their
friends') current or future whereabouts.

• Socware:
Socware entails fake and possibly damaging posts and
messages from friends in OSNs.


C. Combination Threats 


Today’s attackers can also combine classic and modern 
threats in order to create a more sophisticated attack. 


D. Threats Targeting Children 


Children, whether young children or teenagers, certainly 
experience the classic and modern threats detailed above, but 
there are also threats that intentionally and specifically target 
younger users of OSNs. 

• Online Predators:
Behaviors that are considered to be Internet sexual
exploitation of children include adults using children for
the production of child pornography and its distribution,
consumption of child pornography, and the use of the
Internet as a means to initiate online or offline sexual
exploitation.

• Risky Behaviors:
Potential risky behaviors of children may include direct
online communication with strangers, use of chat rooms
for interactions with strangers, sexually explicit talk with
strangers, and giving private information and photos to
strangers.

• Cyberbullying:
Cyberbullying is bullying that takes place within
technological communication platforms, such as emails,
chats, phone conversations, and OSNs, by an attacker who
uses the platform to harass his victim by sending repeated
hurtful messages, sexual remarks, or threats; by publishing
embarrassing pictures or videos of the victim; or by
engaging in other inappropriate behavior.


VI. SOLUTIONS 
A. Social Network Operator Solutions 


OSN operators attempt to protect their users by activating 
safety measures, such as employing user authentication mech- 
anisms and applying user privacy settings. 

• Authentication Mechanisms:
In order to make sure the user registering or logging into
the social network is a real person and not a socialbot
or a compromised user account, OSN operators use
authentication mechanisms, such as CAPTCHA,
photos-of-friends identification, multi-factor
authentication, and in some cases even requesting that the
user send a copy of his or her government-issued ID.

• Security and Privacy Settings:
Many OSNs support various configurable user privacy
settings that enable users to protect their personal data
from other users or applications.

• Internal Protection Mechanisms:
Several OSNs protect their users by implementing
additional internal protection mechanisms for defense
against spammers, fake profiles, scams, and other threats.

• Report Users:
OSN operators can attempt to protect young children and
teenage users from harassment by adding an option to
report abuse or policy violations by other users in the
network.


B. Commercial Solutions 


Various commercial companies have expanded their tradi- 
tional Internet security options and now offer software solu- 
tions specifically for OSN users to better protect themselves 
against threats. 


• Internet Security Solutions:
These software suites typically include anti-virus,
firewall, and other Internet protection layers which assist
OSN users in shielding their computers against threats
such as malware, clickjacking and phishing attacks.

• AVG PrivacyFix:
AVG PrivacyFix is software available as a mobile
application or a web browser add-on which offers its users
a simple way to manage their privacy settings on Facebook,
LinkedIn, and Google.

• FB Phishing Protector:
FB Phishing Protector is a Firefox add-on which warns
Facebook users when a suspicious activity is detected,
such as a script-injection attempt. This add-on provides
protection against various phishing attacks.

• Norton Safe Web:
Norton Safe Web scans the Facebook user's News Feed
and warns the user about unsafe links and sites.

• McAfee Social Protection:
McAfee Social Protection is a mobile application which
enables Facebook users to safeguard their uploaded photos
by letting users control precisely who can view and
download their images.

• MyPermissions:
MyPermissions is a web service that provides its users
with convenient links to the permissions pages for many
OSNs, such as Facebook, Twitter, and LinkedIn.

• NoScript Security Suite:
NoScript is an open-source extension to Mozilla-based web
browsers like Firefox, which allows executable web content
such as JavaScript, Java, and Flash to run only from
trusted domains of the user's choice.

• Privacy Scanner for Facebook:
Trend Micro's Privacy Scanner for Facebook is an Android
application which scans the user's privacy settings
and identifies risky settings which may lead to privacy
concerns. It then assists the user in fixing the settings.

• Defensio:
Websense's Defensio web service helps protect social
network users from threats like links to malware that
could be posted on the user's Facebook page.

• ZoneAlarm Privacy Scan:
Check Point's ZoneAlarm Privacy Scan is a Facebook
application which scans recent activity in the user's
Facebook account to identify privacy concerns and to
control what others can see.

• Net Nanny:
ContentWatch's Net Nanny is software which assists
parents in protecting their children from harmful content.

• MinorMonitor:
Infoglide's MinorMonitor is a parental control web service
which gives parents a quick dashboard view of their
child's Facebook activities and online friends.


Fig. 3. Security and privacy solutions for online social networks


C. Academic Solutions 


These academic solutions provide cutting-edge insight into 
dealing with social network threats. They can be used by 
OSN operators to improve their users’ security and privacy, 
by security companies to offer the customers better OSN 
protection, or by early-adopter OSN users who want to better 
protect themselves. 


• Improving Privacy Setting Interfaces:
In recent years several studies have offered OSN users
methods and applications to help them better understand
and improve their social network privacy settings.

• Phishing Detection:
Many researchers have suggested anti-phishing methods
to identify and prevent phishing attacks.

• Spammer Detection:
Many researchers have recently proposed solutions for
spammer detection in OSNs.

• Cloned Profile Detection:
Researchers have designed and implemented a prototype
which can be employed to investigate whether or not users
have fallen victim to clone attacks.

• Fake Profile Detection:
In recent years, researchers have developed algorithms,
techniques, and tools to identify fake profiles and prevent
various Sybil attacks via OSNs.

• Socware Detection:
In the last few years, several studies have tried to better
understand and identify socware. These studies discovered
several insights about socware propagation characteristics
that can assist in future research on the detection and
prevention of socware propagation.

• Preventing Information and Location Leakage:
In their study on privacy leaks on Twitter, Mao et al.
offered a "guardian angel service" that can monitor
users' tweets and alert users to potential privacy
violations. Their solution can be based on the classifiers
they constructed throughout their study, which can
identify tweets containing private information, such as
vacation plans.


VII. FUTURE RESEARCH DIRECTIONS 


The field of OSN security and privacy is a new and emerging 
one, offering many directions to pursue. Security researchers 
can continually provide better solutions to online threats; 
they can also discover new security threats to address. To 
improve the present solutions, the next step is to create synergy 
among the different security solutions. Besides the creation 
of synergy, another worthwhile direction is to apply various 
algorithms to enhance OSN security. A variety of Natural 
Language Processing (NLP) techniques and temporal analysis 
algorithms can be utilized; combining these with existing 
solutions would provide better and more accurate protection 
against social network threats. A further research direction for 
improving OSN users’ privacy is to analyze and evaluate the 
different existing privacy solutions offered by OSN operators, 
pinpointing their shortcomings and suggesting methods for im- 
proving privacy solutions. Research that develops techniques 
to better educate users about these solutions would also be 
of value, as would techniques to make users more aware of 
existing OSN threats. One additional possible future research 
direction includes studying the emerging security threats due 
to the increasing popularity of geo-location tagging of social 
network users [158] in order to offer solutions for threats with 
geosocial specificity. 


VIII. CONCLUSION 


OSNs have become part of our everyday life and, on aver- 
age, most Internet users spend more time on social networks 
than in any other online activity. We enjoy using OSNs to 
interact with other people through the sharing of experiences, 
pictures, and videos. Nevertheless, social networks have a dark 
side ripe with hackers, fraudsters, and online predators, all of 
whom are capable of using OSNs as a platform for procuring 
their future victims. Social networks can enhance our lives,
but we must take the correct precautions to preserve our
security and privacy. The introduction of SNS features has
introduced a new organizational framework for online
communities. SNS research has so far focused on impression
management and friendship performance, networks and network
structure, online/offline connections, and privacy issues; it
is expected to concentrate next on business models, new
technologies, and mobile SNSs. These directions also
constitute the future research work based on this paper.
Abstract—Digital watermarking is one of the techniques used
to protect data such as documents, images, audio and video on
the Internet, with applications in copyright protection,
copyright management and tamper detection. It is also a data
hiding technique in which information or a message is hidden
inside a signal in a way that is transparent to the user. A
watermark is a secondary image which is overlaid on the host
image and provides a means of protecting the image. To provide
a high-quality watermarked image, the watermark should be
imperceptible. Basically, watermarking can be defined as the
process of embedding watermarks in digital media, e.g., audio,
video, image and text, using an appropriate algorithm. The
purposes of a watermark are broad, including the identification
of a work and the discouragement of its unauthorized use.

This study presents a digital image watermarking technique
based on the discrete wavelet transform (DWT) and encryption,
with watermark embedding and extraction algorithms that use
DWT coefficients. Video watermarking is one of the interesting
fields in which to develop a system with authentication and
copyright protection methodology embedded within an efficient
video codec. Thus, this technique can be used for copyright
protection, piracy tracing, content authentication,
advertisement surveillance, and so forth.


Index Terms—Digital watermarking, image watermarking, 
discrete wavelet transform, spatial domain, frequency domain, 
video watermarking. 


I. INTRODUCTION 


There are tons of data distributed over the Internet. These
data are stored and transmitted in a digital format and can
easily be copied without loss of quality and efficiently
distributed. That is why protection has become increasingly
important. For hiding multimedia information, watermarking is
a relatively new technique. Its applications are broad,
including data authentication, protection of ownership,
broadcast monitoring, etc. Basically, watermarking can be
defined as the process of embedding watermarks in digital
media, e.g., audio, video, image and text, using an
appropriate algorithm.

The purpose of watermarking includes the identification of a
work and the discouragement of its unauthorized use. Digital
watermarking is a rapidly developing field and is used in
various applications, most of which have proved to be
successful. The aim of every application is to provide
security to the digital content. The efficiency of a digital
watermarking algorithm depends on the robustness of the
embedded watermark against various types of attacks, such as
salt-and-pepper noise and JPEG compression. Of the many
approaches possible to protect visual data, digital
watermarking is probably the one that has received the most
interest. The goal is to produce a watermarked image that
looks exactly the same as the original to the human eye. In
principle, any image watermarking technique can be extended
to watermarking videos, but in practice video watermarking
techniques need to meet additional challenges that image
watermarking schemes do not face, such as video coding
technologies, large volumes of data, blind watermark
detection, the imbalance between moving and motionless
regions, special attacks like frame averaging, frame swapping
and statistical analysis, and real-time requirements.


II. CHARACTERISTICS OF DIGITAL WATERMARKING 


The important characteristics of digital watermarking are
as follows:


1) Robustness:
The watermark must be able to survive normal image
processing operations such as cropping, transformation,
compression and so forth.

2) Imperceptibility:
The watermarked image must look the same as the original
image to the normal eye. The observer cannot detect that
a watermark is embedded in it.

3) Security:
An unauthorized person cannot detect, retrieve or alter
the embedded watermark.

4) Transparency:
Transparency relates to the properties of the human
sensory system. A transparent watermark causes no
artifacts or quality loss.

5) Capacity:
Capacity describes how many information bits can be
embedded. It also addresses the possibility of embedding
more than one watermark in one document in parallel. The
capacity requirement always works against two other
important requirements, namely imperceptibility and
robustness. A higher capacity is generally obtained at
the expense of either robustness or imperceptibility,
or both.


III. PROPERTIES OF DIGITAL WATERMARKING 


1) Robustness:
The watermark should be impossible to remove even if
the algorithmic principle of the watermarking method is
public.

2) Unambiguous:
The retrieved watermark should uniquely identify the
copyright owner of the content or, in the case of
fingerprinting applications, the authorized recipient of
the content.

3) Fidelity:
A watermark has high fidelity if the degradation it
causes is very difficult for the viewer to perceive.

4) Computational cost:
Embedding and extraction of the watermark from the video
should both be fairly fast and should have low
computational complexity.

5) Interoperability:
The watermarking system must be interoperable for
compressed and decompressed operations.

6) Universal:
The same digital watermarking algorithm needs to be
applicable to all three media under consideration. This
feature is also favourable for the implementation of audio
and image/video watermarking algorithms on common
hardware.

7) Unobtrusive:
The watermark needs to be perceptually invisible.

8) Effectiveness:
This is the probability that the information in a
watermarked image will be correctly detected; ideally,
this probability should be 1.

9) Image fidelity:
Watermarking changes the original image in order to add
information to it; therefore, it inevitably affects the
image quality. This degradation of the image quality
should be kept to a minimum, so that no obvious difference
in image fidelity can be noticed.

10) Payload size:
The amount of information that each watermarked work is
used to carry.


IV. CLASSIFICATION OF DIGITAL WATERMARKING 


Fig. 2. Classification of Digital Watermarking


A. Spatial Domain


Spatial domain watermarking slightly modifies the pixels
of one or two randomly selected subsets of an image. Spatial
domain techniques are classified into three types:


1) Least Significant Bit (LSB):
In this technique the watermark is embedded in the LSBs
of pixels. Two types of LSB techniques have been proposed.
In the first method the LSB of the image is replaced with
a pseudo-noise (PN) sequence, while in the second a PN
sequence is added to the LSB. This method is easy to use
but not very robust against attacks.

2) Patchwork Technique:
In patchwork, n pairs of image points (a, b) are randomly
chosen. The image data in a are lightened while those in
b are darkened. This technique provides a high level of
robustness against many types of attacks, but only a very
small amount of information can be hidden.

3) Predictive Coding Scheme:
In this method, a pseudorandom noise (PN) pattern, say
W(x, y), is added to the cover image. The robustness of
the watermark is increased by increasing the gain factor,
but a large increase in the gain factor may decrease
image quality.
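As an illustration of the first spatial domain technique, the
following sketch replaces the least significant bit of the
first pixels of a grayscale image with a bit sequence. The
flat list of 8-bit pixel values, the function names and the
choice of which pixels carry the bits are assumptions made for
the example; the bit sequence could equally be a PN sequence
or watermark data.

```python
from typing import List


def embed_lsb(pixels: List[int], bits: List[int]) -> List[int]:
    """Replace the LSB of the first len(bits) pixels with the given bits."""
    if len(bits) > len(pixels):
        raise ValueError("not enough pixels to carry the watermark")
    marked = list(pixels)
    for k, bit in enumerate(bits):
        # Clear the lowest bit, then set it to the watermark bit.
        marked[k] = (marked[k] & ~1) | (bit & 1)
    return marked


def extract_lsb(pixels: List[int], n_bits: int) -> List[int]:
    """Read the watermark back from the LSBs of the first n_bits pixels."""
    return [p & 1 for p in pixels[:n_bits]]


cover = [120, 121, 200, 255]
watermark = [1, 0, 1, 0]
marked = embed_lsb(cover, watermark)    # [121, 120, 201, 254]
recovered = extract_lsb(marked, len(watermark))
```

Each pixel changes by at most one gray level, which is why the
mark is imperceptible, and also why any operation that disturbs
the low-order bits (compression, noise) destroys it.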


B. Frequency Domain


In frequency domain techniques the image is first transformed
to the frequency domain by the use of a transformation method
such as the Fourier transform, the discrete cosine transform
(DCT) or the discrete wavelet transform (DWT). Frequency
domain techniques are also divided into three types:


1) Discrete Cosine Transform (DCT):
First, the image is segmented into non-overlapping blocks
of 8x8 pixels. Then the forward DCT is applied to each of
these blocks. After that, block selection criteria and
then coefficient selection criteria are applied. The
watermark is then embedded by modifying the selected
coefficients, and finally the inverse DCT is applied to
each 8x8 block.

2) Discrete Wavelet Transform (DWT):
The DWT is more frequently used due to its time/frequency
characteristics. Here an image is passed through a series
of low-pass and high-pass filters which decompose the
image into sub-bands of different resolutions. The image
is decomposed into four parts: the top left part contains
the low frequencies of the original image, the bottom left
contains its vertical details, the top right contains its
horizontal details, and the bottom right contains its high
frequencies. This technique uses wavelet filters to
transform the image.

3) Discrete Fourier Transform (DFT):
The DFT transforms a signal into its frequency components.
The DFT is rotation, scaling and translation (RST)
invariant, whereas spatial domain, DCT and DWT techniques
are not. DFT can therefore be used to recover from various
geometric attacks such as cropping.
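The four-part decomposition described for the DWT can be
illustrated with a one-level 2D Haar transform, the simplest
wavelet filter pair. The choice of the Haar wavelet, the
function names and the neutral sub-band names are assumptions
made for this sketch; which detail band is called "horizontal"
and which "vertical" differs between texts, so the bands are
named after the difference they actually compute.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def haar_dwt2(image: Matrix) -> Tuple[Matrix, Matrix, Matrix, Matrix]:
    """One-level 2D Haar DWT; returns (approx, col_diff, row_diff, diag)."""
    rows, cols = len(image), len(image[0])
    if rows % 2 or cols % 2:
        raise ValueError("image height and width must be even")
    approx, col_diff, row_diff, diag = [], [], [], []
    for r in range(0, rows, 2):
        a_row, c_row, r_row, d_row = [], [], [], []
        for c in range(0, cols, 2):
            p, q = image[r][c], image[r][c + 1]          # top pair
            s, t = image[r + 1][c], image[r + 1][c + 1]  # bottom pair
            a_row.append((p + q + s + t) / 2)  # low-pass in both directions
            c_row.append((p - q + s - t) / 2)  # difference across columns
            r_row.append((p + q - s - t) / 2)  # difference across rows
            d_row.append((p - q - s + t) / 2)  # diagonal detail
        approx.append(a_row)
        col_diff.append(c_row)
        row_diff.append(r_row)
        diag.append(d_row)
    return approx, col_diff, row_diff, diag


def haar_idwt2(approx: Matrix, col_diff: Matrix,
               row_diff: Matrix, diag: Matrix) -> Matrix:
    """Inverse of haar_dwt2: rebuild the image from its four sub-bands."""
    image = []
    for a_row, c_row, r_row, d_row in zip(approx, col_diff, row_diff, diag):
        top, bottom = [], []
        for a, c, r, d in zip(a_row, c_row, r_row, d_row):
            top += [(a + c + r + d) / 2, (a - c + r - d) / 2]
            bottom += [(a + c - r - d) / 2, (a - c - r + d) / 2]
        image += [top, bottom]
    return image


bands = haar_dwt2([[1, 2], [3, 4]])
restored = haar_idwt2(*bands)
```

Each sub-band has half the height and width of the input, and
applying the transform again to the approximation band gives
the multi-resolution decomposition used later in the embedding
algorithm.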
V. TYPES OF DIGITAL WATERMARKS 


Watermarks and watermarking techniques can be divided 
into various categories in various ways. 

The digital watermarks can be divided into three different 
types as follows: 


1) Visible watermark:
A visible watermark is a secondary translucent image
overlaid on the primary image.

2) Invisible-Robust watermark:
The invisible-robust watermark is embedded in such a
way that the alterations made to the pixel values are not
perceptually noticeable, and it can be recovered only with
an appropriate decoding mechanism.

3) Invisible-Fragile watermark:
The invisible-fragile watermark is embedded in such a
way that any modification of the image would alter or
destroy the watermark.


According to the type of document to be watermarked,
watermarking techniques can be divided into four categories
as follows:


1) Text Watermarking:
Nowadays, with the extensive use of the Internet all over
the world, most government, public and private data are
increasingly being published on the Internet. The
protection of these online documents is the need of the
hour and needs to be dealt with urgently. Text
watermarking is the technique which helps to protect the
authenticity and integrity of text documents by inserting
watermarks in the text. Text, being the simplest mode of
communication and information exchange, brings various
challenges when it comes to copyright protection. Any
transformation of the text should preserve its meaning,
fluency, grammaticality, writing style and value. Short
documents have low capacity for watermark embedding
and are relatively difficult to protect. Text watermarking
algorithms also depend on the text size and on its
language, rules, grammar, conventions and writing style.

2) Audio Watermarking:
An audio watermark is a unique electronic identifier
embedded in an audio signal, typically used to identify
ownership of copyright. It is similar to a watermark on
a photograph. Watermarking is the process of embedding
information into a signal (e.g., audio, video or pictures)
in a way that is difficult to remove. If the signal is
copied, then the information is also carried in the copy.
Watermarking has become increasingly important to enable
copyright protection and ownership verification. One
of the most secure techniques of audio watermarking is
spread spectrum audio watermarking (SSW). In SSW,
a narrow-band signal is transmitted over a much larger
bandwidth such that the signal energy present in any
single frequency is undetectable. Thus, the watermark is
spread over many frequency bands so that the energy in
any one band is undetectable.

3) Image Watermarking:
Fig. 3 shows the general block diagram of digital
image watermarking. Digital image watermarking can
protect image, video and audio content from unauthorized
persons, noise, copyright infringement, etc. The
best-known image watermarking method that works in the
spatial domain is the Least Significant Bit (LSB) method,
which replaces the least significant bits of the pixels
selected to hide the information.

4) Video Watermarking:


Fig. 3. Block Diagram of Digital Image Watermarking


VI. ALGORITHMS


We consider the algorithms for watermark embedding and
watermark extraction.


A. Watermark Embedding Algorithm


1) Firstly, the watermark image is encrypted using an image
encryption algorithm based on row and column rotation,
driven by a random number generator with key k.


Goury Priya M S et al., A Study on Digital Watermarking 


83 


Proceedings of Vidya MCA Departmental Seminar (VMCADS - 2021), 22 - 23 November 2021 


Vidya Academy of Science & Technology, Thrissur — 680501 


Fig. 4. (a) Original watermark image (b) Encrypted watermark image 


2) Secondly, the original input image is decomposed using 
the two-dimensional (2D) DWT to obtain scaled sub-images 
of reduced size. The encrypted watermark image is likewise 
decomposed using the 2D DWT to obtain decomposed, scaled 
watermark images. Fig. 5 shows the multi-resolution 
decomposed images obtained after applying the 2D DWT to 
the original input image and the watermark image. 


Fig. 5. Decomposition of image through 2D DWT (a) original input image 
(b) watermark image 


3) Thirdly, the pixel positions in the decomposed input 
image at which the decomposed watermark image is to be 
embedded are identified based on Euclidean distance. The 
greater the similarity between the input and watermark 
images at a position, the less the visibility of the input 
image changes and the stronger the watermark can be. Such 
positions are therefore more suitable for embedding the 
watermark into the input image. 

4) Fourthly, the encrypted watermark is embedded into the 
input image using (1), depending on the match between the 
decomposed images of the input and the encrypted watermark 
image. 


where $\alpha$ is the visibility coefficient, $i(i, j)$ are 
the DWT coefficients of the respective decomposed input 
image, $i_w(i, j)$ are the DWT coefficients of the 
respective decomposed watermark image, and $y(i, j)$ are 
the DWT coefficients of the watermark embedded output image. 
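Steps 1 and 4 above can be sketched as follows. This is only an illustration under stated assumptions: the key-driven row and column rotation is implemented with Python's `random.Random` seeded by the key, and the embedding rule is assumed to be the simple additive form y = i + alpha * iw, because equation (1) is not reproduced in this text. All function names are hypothetical, and the matrices stand in for coefficient arrays.

```python
import random


def _shifts(image, key):
    """Derive one circular shift per row and per column from the key."""
    rows, cols = len(image), len(image[0])
    rng = random.Random(key)
    row_shifts = [rng.randrange(cols) for _ in range(rows)]
    col_shifts = [rng.randrange(rows) for _ in range(cols)]
    return row_shifts, col_shifts


def _shift_columns(image, shifts):
    """Circularly shift column c downwards by shifts[c] positions."""
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]
    for c, s in enumerate(shifts):
        for r in range(rows):
            out[(r + s) % rows][c] = image[r][c]
    return out


def rotate_encrypt(image, key):
    """Rotate every row to the right, then every column downwards."""
    row_shifts, col_shifts = _shifts(image, key)
    shifted = [list(row[-s:]) + list(row[:-s]) for row, s in zip(image, row_shifts)]
    return _shift_columns(shifted, col_shifts)


def rotate_decrypt(image, key):
    """Undo rotate_encrypt: columns first, then rows, with the same key."""
    row_shifts, col_shifts = _shifts(image, key)
    unshifted = _shift_columns(image, [-s for s in col_shifts])
    return [row[s:] + row[:s] for row, s in zip(unshifted, row_shifts)]


def embed(host, mark, alpha):
    """Assumed additive rule: y = i + alpha * iw (not taken from the paper)."""
    return [[h + alpha * w for h, w in zip(hr, wr)] for hr, wr in zip(host, mark)]


def extract(marked, host, alpha):
    """Non-blind extraction: the original host coefficients are required."""
    return [[(y - h) / alpha for y, h in zip(yr, hr)] for yr, hr in zip(marked, host)]
```

Decryption regenerates the same shifts from the same key and applies them in reverse order, which is why the key must be available at extraction time.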


B. Watermark Extraction and Detection Algorithm 


The watermark extraction is exactly the reverse of the 
watermark embedding procedure. The algorithm presented in 
this paper is non-blind and therefore requires the original 
input image and the encryption key for watermark extraction 
and detection. The similarity between the original watermark 
image and the extracted watermark image was measured using 
three parameters: 

• Mean Square Error (MSE) 

• Normalized Correlation Coefficient (CC) 

• Peak Signal to Noise Ratio (PSNR) 

In general, CC > 0.75 and PSNR > 30 dB are considered 
acceptable. It is also necessary to evaluate these 
watermarking parameters under various signal processing 
attacks. The watermark embedding and extraction algorithms 
are shown in Fig. 6 and Fig. 7. 
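The three similarity measures can be sketched as below. The paper does not spell out its formulas, so these are the commonly used definitions, stated here as assumptions: images are flattened to equal-length lists, the peak value is 255 for 8-bit images, and the normalized correlation is the inner product divided by the product of the norms.

```python
import math


def mse(a, b):
    """Mean of the squared differences between two equal-length images."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def psnr(a, b, peak=255.0):
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    err = mse(a, b)
    if err == 0:
        return math.inf
    return 10.0 * math.log10(peak ** 2 / err)


def normalized_correlation(a, b):
    """Inner product of the images divided by the product of their norms."""
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return num / den if den else 0.0
```

With these definitions, an extracted watermark would be judged against the thresholds quoted above by checking `normalized_correlation(original, extracted) > 0.75` and `psnr(original, extracted) > 30`.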




Fig. 6. Watermark Embedding 


C. Results and Discussion 


The watermark embedding and extraction algorithms were 
implemented in MATLAB and executed on an Intel i5 
processor with 1 GB RAM and a 3 GHz clock speed. The Lena 
image of size 228 × 228 and the baboon image of size 90 × 90 
were selected as the input and watermark images, respectively. 


D. Digital Image Watermarking Through Encryption 


The performance of the presented algorithm is evaluated 
through three parameters: MSE, CC, and PSNR. The comparison 
of the experimentally obtained parameters was performed 


Fig. 7. Watermark Extraction 


Fig. 8. (a) Original input image (b) Watermark embedded image (c) Original 
watermark image (d) Extracted watermark image 


under two different conditions: with and without attacks. 
Three general attacks were considered: salt-and-pepper 
noise, a geometrical attack through rotation, and a JPEG 
compression attack. Salt-and-pepper noise of density 0.01, 
compression with a compression ratio of 2, and a rotation 
of 90° were applied to the watermark embedded image. 
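Two of the three attacks are easy to reproduce in a few lines. The sketch below is an assumption-laden illustration, not the MATLAB code used in the experiments: images are lists of rows of 8-bit values, the noise generator is seeded for repeatability, and the rotation is taken to be clockwise. JPEG compression is omitted because it needs an image codec.

```python
import random


def salt_and_pepper(image, density, seed=0):
    """Set roughly `density` of the pixels to 0 (pepper) or 255 (salt)."""
    rng = random.Random(seed)
    noisy = [list(row) for row in image]
    for r in range(len(noisy)):
        for c in range(len(noisy[0])):
            if rng.random() < density:
                noisy[r][c] = 255 if rng.random() < 0.5 else 0
    return noisy


def rotate90(image):
    """Rotate the image clockwise by 90 degrees."""
    return [list(row) for row in zip(*image[::-1])]
```

A robustness test would apply one of these functions to the watermark embedded image, run the extraction, and recompute MSE, CC and PSNR on the extracted watermark.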
Fig. 9. (a) Salt-and-pepper (b) Rotation (c) Compression 


Any image watermarking technique can be extended to 
watermarking videos. 


Fig. 10. Block Diagram of Video Watermarking System 


1) Generation and Embedding: 
In embedding, an algorithm accepts the host and the data 
to be embedded, and produces a watermarked signal. The 
signal in which the watermark is to be embedded is called 
the host signal. 

2) Distribution and Possible Attacks: 
The distribution process can be seen as the transmission 
of the signal through the watermark channel. Possible 
attacks in the broadcast channel may be intentional or 
accidental. 

3) Detection: 
The detection process allows the owner to be identified and 
provides information to the intended recipient. 


VII. CLASSIFICATION OF DIGITAL VIDEO 
WATERMARKING TECHNIQUES 


A. According to the types of carriers 


Fig. 11. Block Diagram of Video Watermarking according to types of carriers 
[Diagram: Original video, Video Encoder, Encoded video through channel, 
Video Decoder, Decoded video, with embedding and extraction points 1 to 3] 


1) Embed/Extract 1: 
The watermark is embedded directly into the original video 
sequence, and the watermarked sequence is then encoded. 
The advantage of this type is that the watermark is easy 
to embed; the disadvantages are that it increases the bit 
rate of the video data stream and that the watermark may 
be lost after compression. 

2) Embed/Extract 2: 
Watermark embedding and detection are done at the 
encoder and decoder. The advantage of this type is that it 
does not increase the bit rate of the video data stream, and 
it is a relatively simple method of embedding the watermark 
in the transform domain. 

3) Embed/Extract 3: 
The watermark is embedded in the compressed domain. 
The advantage of this type is that its computational 
complexity is lower than that of the other types; the 
disadvantage is that the compressed bit rate constrains the 
size of the watermark data. 


B. According to the types of Domains 


1) Spatial Domain Video Watermarking Techniques: 
The watermark design and the watermark insertion pro- 
cedures do not involve any transforms. Simple techniques 
like addition or replacement are used for the combination 
of watermark with the host signal and embedding takes 
place directly in the pixel domain. 

2) Least Significant Bit modification (LSB): 
The Least Significant Bit (LSB) method is the simplest 
technique of this domain. In this scheme the watermark 
is simply embedded into the least significant bits of 
the original video. Due to its simplicity, it is the most 
popular scheme, but it also has limitations: poor quality 
of the produced video, weak resistance to the various 
attacks, low robustness and lack of imperceptibility. 

3) Correlation based techniques: 
A pseudo-random noise (PN) pattern $W(x, y)$ is added to 
the cover image $I(x, y)$, according to the equation shown 
below: 


$$I_W(x, y) = I(x, y) + k \times W(x, y)$$


In the equation above, $k$ is a gain factor and $I_W$ is the 
watermarked content. Increasing the value of $k$ increases 
the robustness of the watermark at the expense of the 
quality of the watermarked content. 

4) Frequency Domain Video Watermarking Techniques: 
In frequency domain techniques, the watermark is em- 
bedded by modifying the transform coefficients of the 
frames of the video sequence. The most commonly used 
transforms are the Discrete Fourier Transform (DFT), 
the Discrete Cosine Transform (DCT), and the Discrete 
Wavelet Transform (DWT). Generally, the main drawback 
of transform domain methods is their higher computa- 
tional requirement. 

5) SVD Domain Video Watermarking Technique: 
Singular Value Decomposition (SVD) is a numerical 
technique for diagonalizing matrices in which the 
transformed domain consists of basis states that are 
optimal in some sense. The SVD of an N × N matrix A is 
defined by the operation: 

$$A = U S V^{T}$$


where U and V are unitary, and S is an N × N diagonal 
matrix. The diagonal entries of S are called the singular 
values of A and are assumed to be arranged in decreasing 
order. Embedding watermark information in the diagonal 
elements of matrix U or matrix V showed more robustness 
against noise than embedding in matrix S. 
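The correlation-based scheme in item 3 above can be sketched as follows. The sketch assumes a PN pattern of +1/-1 values generated from a secret key with Python's `random.Random`, and a detector that simply correlates the received signal with the same pattern; the names `pn_pattern`, `embed` and `correlation` are hypothetical.

```python
import random


def pn_pattern(n, key):
    """Key-dependent pseudo-random pattern W of +1/-1 values."""
    rng = random.Random(key)
    return [rng.choice((-1, 1)) for _ in range(n)]


def embed(cover, pattern, k):
    """I_W = I + k * W, applied sample by sample."""
    return [p + k * w for p, w in zip(cover, pattern)]


def correlation(signal, pattern):
    """Average of signal * pattern; used as the detection statistic."""
    return sum(s * w for s, w in zip(signal, pattern)) / len(pattern)


cover = [(37 * i) % 256 for i in range(256)]  # hypothetical pixel values
w = pn_pattern(len(cover), key=1234)
marked = embed(cover, w, k=5)
```

Because every entry of W squares to 1, embedding raises the detection statistic by exactly k. A detector would compare the statistic with a threshold, on the assumption that the cover itself is only weakly correlated with the pattern; a larger k makes detection easier but distorts the content more.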


C. Discrete Fourier Transform Video Watermarking Technique 


This approach first extracts the brightness of the frame to 
be watermarked, computes its full-frame DFT, and takes the 
magnitude of the coefficients. A DFT-based watermarking 
scheme with template matching can resist a number of 
attacks, including pixel removal, rotation and shearing. 
The purpose of the template is to enable resynchronization 
of the watermark payload spreading sequence. The template 
is a key-dependent pattern of peaks, which is also embedded 
into the DFT magnitude representation of the frame. 


D. Discrete Cosine Transform Video Watermarking Technique 


The Discrete Cosine Transform (DCT) is an important method 
for video watermarking. Many digital video watermarking 
algorithms embed the watermark in this domain. The transform 
is useful because most video compression standards are based 
on the DCT and related transforms. In this domain, some DCT 
coefficients of the video are selected and divided into 
groups, and the watermark bits are then embedded by 
adjusting the coefficients in each group. 

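One simple way to "adjust the coefficients in a group" is to let the relative order of two mid-frequency coefficients encode one bit. The sketch below illustrates that idea on a single 8-sample block with a hand-written orthonormal DCT. The choice of coefficient indices 3 and 4, the margin of 2.0 and the one-dimensional block are all assumptions made for illustration, and pixel values are left unrounded.

```python
import math

N = 8  # block length (an assumption for this sketch)


def _scale(k):
    return math.sqrt((1 if k == 0 else 2) / N)


def dct(block):
    """Orthonormal DCT-II of an N-sample block."""
    return [
        _scale(k) * sum(block[n] * math.cos(math.pi * (n + 0.5) * k / N) for n in range(N))
        for k in range(N)
    ]


def idct(coeffs):
    """Inverse of dct (orthonormal DCT-III)."""
    return [
        sum(_scale(k) * coeffs[k] * math.cos(math.pi * (n + 0.5) * k / N) for k in range(N))
        for n in range(N)
    ]


def embed_bit(block, bit, a=3, b=4, margin=2.0):
    """Force coeff[a] > coeff[b] for bit 1 and coeff[a] < coeff[b] for bit 0."""
    c = dct(block)
    lo, hi = sorted((c[a], c[b]))
    if hi - lo < margin:
        # Push the pair apart so the order survives small disturbances.
        mid = (hi + lo) / 2
        lo, hi = mid - margin / 2, mid + margin / 2
    c[a], c[b] = (hi, lo) if bit else (lo, hi)
    return idct(c)


def extract_bit(block, a=3, b=4):
    c = dct(block)
    return 1 if c[a] > c[b] else 0
```

Only two coefficients change, so the block is altered slightly, and the bit can be read back without the original block.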
E. Discrete Wavelet Transform Video Watermarking Technique 


The frequency distribution is transformed at each step of 
the DWT, where L represents low frequency, H represents 
high frequency, and the subscript following them represents 
the number of transform layers. Sub-band LL represents the 
lower-resolution approximation of the original video, while 
the mid-frequency and high-frequency detail sub-bands LH, HL 
and HH represent vertical edge, horizontal edge and diagonal 
edge details. The process can be repeated to compute the 
multiple-scale wavelet decomposition, as shown in Fig. 12. 
Fig. 12. DWT in Square mode 
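A single decomposition level can be sketched with the Haar wavelet, the simplest DWT. The sketch assumes an image with even width and height stored as a list of rows, and uses plain averages and differences rather than the orthonormal scaling. Naming conventions for the LH and HL sub-bands differ between texts, so the labels below are one possible choice.

```python
def haar_1d(values):
    """Split a sequence into pairwise averages (low) and differences (high)."""
    half = len(values) // 2
    low = [(values[2 * i] + values[2 * i + 1]) / 2 for i in range(half)]
    high = [(values[2 * i] - values[2 * i + 1]) / 2 for i in range(half)]
    return low, high


def _along_columns(block):
    """Apply haar_1d to every column and return the (low, high) halves."""
    lows, highs = [], []
    for col in zip(*block):
        lo, hi = haar_1d(list(col))
        lows.append(lo)
        highs.append(hi)
    # Transpose back so the results are lists of rows again.
    return [list(r) for r in zip(*lows)], [list(r) for r in zip(*highs)]


def haar_2d(image):
    """One level of 2D Haar DWT: returns the sub-bands LL, LH, HL, HH."""
    low_rows, high_rows = [], []
    for row in image:
        lo, hi = haar_1d(row)
        low_rows.append(lo)
        high_rows.append(hi)
    ll, lh = _along_columns(low_rows)
    hl, hh = _along_columns(high_rows)
    return ll, lh, hl, hh
```

Calling `haar_2d` again on the returned LL sub-band gives the next level of the multi-scale decomposition.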
F. Principal Component Analysis Video Watermarking Technique 


Principal component analysis (PCA) is a mathematical 
procedure that uses an orthogonal transformation to convert a 
set of observations of correlated variables into a set of values 
of uncorrelated variables called principal components. PCA 
maps the data into a new coordinate system in which the 
direction of greatest variance is the first principal 
component, the direction of next greatest variance is the 
second principal component, and so on. The first principal 
component has the maximum energy concentration. 
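As a small illustration, the first principal component can be found by power iteration on the sample covariance matrix. This is a sketch under assumptions, not the method of any particular watermarking scheme: the data are a list of equal-length tuples, the covariance uses the n - 1 divisor, and the starting vector is all ones, which must not be orthogonal to the answer.

```python
import math


def first_principal_component(samples, iterations=200):
    """Unit vector along the direction of greatest variance."""
    n, d = len(samples), len(samples[0])
    means = [sum(s[j] for s in samples) / n for j in range(d)]
    centered = [[s[j] - means[j] for j in range(d)] for s in samples]
    cov = [
        [sum(row[a] * row[b] for row in centered) / (n - 1) for b in range(d)]
        for a in range(d)
    ]
    v = [1.0] * d
    for _ in range(iterations):
        # Multiply by the covariance matrix and renormalise.
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


points = [(0, 0), (1, 2), (2, 4), (3, 6)]  # hypothetical data on the line y = 2x
pc1 = first_principal_component(points)
```

For these points all the variance lies along the line y = 2x, so the first component points in the direction (1, 2) and carries all of the energy.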


VIII. ATTACKS ON WATERMARK 


1) Frame dropping: 
Dropping one or more frames randomly from the 
watermarked video sequence. 

2) Frame averaging: 
Averaging neighbouring frames to remove the dynamic 
components of the watermarked video. 

3) Frame swapping: 
Switching the order of frames randomly within a 
watermarked video sequence. 

4) Intentional attacks: 
These include single-frame attacks, such as filtering 
attacks, contrast and colour enhancement, and noise-adding 
attacks, and statistical attacks, such as the averaging 
attack and the collusion attack. 

5) Unintentional attacks: 
These arise from degradations that can occur during lossy 
copying, from compression of the video during re-encoding, 
or from changes of frame rate and resolution. 
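The three frame-level attacks listed above can be simulated with the sketch below. It assumes a video held as a list of frames, each frame a flat list of pixel values, and seeded random choices so that runs are repeatable; the function names and the averaging window of three frames are illustrative assumptions.

```python
import random


def drop_frames(frames, count, seed=0):
    """Remove `count` randomly chosen frames, keeping the order of the rest."""
    rng = random.Random(seed)
    dropped = set(rng.sample(range(len(frames)), count))
    return [f for i, f in enumerate(frames) if i not in dropped]


def average_frames(frames, window=3):
    """Replace each frame by the mean of itself and its temporal neighbours."""
    out = []
    for i in range(len(frames)):
        lo = max(0, i - window // 2)
        hi = min(len(frames), i + window // 2 + 1)
        group = frames[lo:hi]
        out.append([sum(px) / len(group) for px in zip(*group)])
    return out


def swap_frames(frames, swaps, seed=0):
    """Exchange `swaps` randomly chosen pairs of frames."""
    rng = random.Random(seed)
    out = list(frames)
    for _ in range(swaps):
        i, j = rng.sample(range(len(out)), 2)
        out[i], out[j] = out[j], out[i]
    return out
```

A watermark that is embedded identically in every frame tends to survive dropping and swapping, whereas a frame-dependent watermark is weakened by averaging.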


IX. APPLICATIONS OF DIGITAL WATERMARKING 


• Copyright protection: 
It is used to identify and protect the copyright 
ownership of a file. 

• Tamper proofing: 
It relies on watermarks that are fragile in nature. 

• Fingerprinting: 
Fingerprints are characteristics of an object that tend to 
distinguish it from other similar objects. 

• Medical application: 
The names of patients can be printed on X-ray reports 
and MRI scans using visible watermarking techniques. 

• Image and content authentication: 
In this application the objective is to detect modification 
of the data. 

• Owner Identification: 
It establishes ownership of the content. 

• Broadcast Monitoring: 
Used especially for advertisements and in the 
entertainment industry, to check that content is broadcast 
as contracted and by the authorized source. 

• Authentication of Content: 
To detect modifications of the content as a sign of invalid 
authentication. 

• Data hiding: 
The transmission of private data is probably one of the 
earliest applications of watermarking. 


X. CONCLUSION 


This paper gives a detailed study of various digital 
watermarking techniques and their applications. It gives a 
comparative analysis of various watermarking techniques as 
well as the general procedure of watermark embedding and 
extraction. Different techniques of digital image 
watermarking, based on spatial and frequency domain 
techniques, have been discussed. The robustness of the 
algorithm against general attacks such as salt-and-pepper 
noise, rotation, and compression is demonstrated. Digital 
image watermarking can protect images, video and audio 
against unauthorized use, noise and copyright infringement. 
DCT and DWT domain watermarking is comparatively much better 
than spatial domain encoding, since DCT domain watermarking 
can survive attacks such as noising, compression, sharpening, 
and filtering; these methods also make use of the JPEG 
compression method and of high-frequency sub-bands such as 
LH, HL and HH. In this paper, we have tried to give a 
complete picture of digital watermarking, which should help 
new researchers gain a broad awareness of this domain. 
Findings on Digital Forensics and IOT Devices 

Abstract—With constant innovation and development in 
technology, the rate of cybercrime is also on the rise, and 
most people are potential victims of the growing pool of 
criminals. Digital forensics is an emerging discipline 
concerned with the detection and investigation of 
computer-based crimes and with gathering digital evidence 
suitable for presentation in court. The use of mobile phones 
has increased remarkably over the past decade. With this 
increase in use, however, mobile phones have become a 
potential source of criminal activity. There is a need to 
examine these mobile devices in order to acquire evidence 
and gain meaningful insights from them. Mobile forensics is 
the branch of digital forensics which aims at investigating, 
in a forensically sound manner, the digital evidence 
recovered from a cell phone, which can provide a wealth of 
information. The market is flooded with open source and 
proprietary mobile phone operating systems, as a result of 
which the techniques and tools currently available fail to 
gain complete insight from the devices, and finding the 
appropriate tool is a challenge. 

Index Terms—Digital forensics, system for shortlisting suspects, 
mobile forensics, data acquisition techniques, cyber-attacks on 
IoT devices. 


I. INTRODUCTION 


DIGITAL forensics, or digital forensic science, is a branch 
of forensic science focused on the recovery and 
investigation of material found in digital devices and 
related to cybercrimes. We propose a digital forensic system 
called SISC that can automatically short-list suspects by 
categorizing the attributes of habitual criminals stored in 
a database using decision tree, logistic regression, and 
chi-squared analysis techniques. 

Mobile forensics is the branch of digital forensics which 
aims at investigating the digital evidence recovered from 
a cell phone in a forensically sound manner. It helps to 
carefully gather and analyse the evidence that exists in a 
mobile phone in the form of text messages, chats, call logs, 
multimedia files, browsing history, GPS location and so on 
without compromising the integrity of the data [6]. One of 
the important concerns is to get access to deleted data 
that resided in the mobile phone. IoT is a network of 
machines that communicate with each other, making smart 
decisions that help people and corporations to take control 
of their ventures. In recent years, the prevalence of IoT 
devices has increased dramatically, a phenomenon that has 
made users more vulnerable to cyber-attacks. As IoT becomes 
a staple of modern life, it is now essential to take cyber 
security seriously if cyber-attacks [5] are to be prevented. 
It is therefore necessary to enhance the security of 
connected devices. Attacks are often traced to a weak link 
in the security network, meaning that connected devices 
serve as the chief target for cyber-attackers. Hackers 
actively look for these weak links in order to take control 
of the devices. 


II. A FORENSIC SYSTEM FOR IDENTIFYING THE 
SUSPECTS OF A CRIME 


Digital forensics has emerged as a promising tool for 
forensic investigators to identify criminals and criminal 
communities. We propose in this paper a digital forensic 
system called SISC (“System for Shortlisting Suspects”) that 
can predict the prime suspects of a crime when no solid 
material evidence has been identified by traditional means. 
SISC can help forensic investigators short-list the likely 
suspects. The following is an overview of the sequential 
processing steps taken by SISC to short-list the suspects: 


A. Constructing a Decision Tree 


First, SISC ranks the categorization attributes based on 
their Information Gains (IG) [7]. Then, it constructs a 
decision tree [8] by placing each node representing a 
categorization attribute at the hierarchical level in the 
tree that corresponds to its rank. For each sub-table, the 
entropy of each distinct data value of the categorization 
attribute is calculated. Entropy (ENT) is computed using the 
formula shown in Equation 1. ENT measures uncertainty: the 
higher the uncertainty, the higher the entropy. The ENT of 
a pure table is zero. 


$$\mathrm{ENT} = -\sum_{j} p_j \log_2 p_j \qquad (1)$$

where $p_j$ is the probability of a Categorization Attribute $j$. 


$$\mathrm{IG} = \mathrm{ENT}(\text{parent}) - \text{WeightedSum of } \mathrm{ENT}(\text{children})$$
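The two quantities can be computed as in the following sketch. The record layout (a list of dictionaries) and the attribute names used in the example are hypothetical; only the entropy and information gain formulas come from the text.

```python
import math
from collections import Counter


def entropy(labels):
    """ENT = -sum(p_j * log2(p_j)) over the distinct values in `labels`."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def information_gain(rows, attribute, target):
    """IG = ENT(parent) - weighted sum of ENT(children) after splitting."""
    parent = entropy([r[target] for r in rows])
    groups = {}
    for r in rows:
        groups.setdefault(r[attribute], []).append(r[target])
    weighted = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return parent - weighted


# Hypothetical records: does a prior offence predict involvement?
records = [
    {"prior": "yes", "region": "a", "involved": "y"},
    {"prior": "yes", "region": "b", "involved": "y"},
    {"prior": "no", "region": "a", "involved": "n"},
    {"prior": "no", "region": "b", "involved": "n"},
]
```

In this toy table `prior` separates the two classes perfectly and so has the highest possible gain, while `region` carries no information; ranking by gain would therefore place `prior` nearer the root of the tree.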


B. Using Logistic Regression to Estimate the Linear Decision 
Boundary of a Categorization Attribute 


SISC uses logistic regression to estimate the linear decision 
boundary of a Categorization Attribute by determining the 
threshold that divides the suspects defined by the attribute 
based on their likelihood of committing a crime similar to 
the one under consideration. Let $P(Y = 1 \mid x_1, \ldots, x_n)$ 
be the probability that the event $Y = 1$ happens given 
variables $x_1, \ldots, x_n$. By taking the (natural) logarithm 
of the odds, we get the "log odds". The coefficients $\beta_i$ 
for the (multi)linear regression of variables 
$x_1, \ldots, x_n$ are estimated as follows: 


$$\ln \frac{P(Y = 1 \mid x_1, \ldots, x_n)}{P(Y = 0 \mid x_1, \ldots, x_n)} = \beta_0 + \beta_1 x_1 + \cdots + \beta_n x_n \qquad (2)$$


Let $y = \beta_0 + \beta_1 x$. Suppose that $B_0$ and $B_1$ are 
estimators of the coefficients $\beta_0$ and $\beta_1$, 
respectively. Equation 3 shows the Ordinary Least Squares 
(OLS) criterion: $B_0$ and $B_1$ are chosen to minimize the 
sum of the squares of the differences between the observed 
and predicted responses. 


$$\mathrm{OLS} = \sum_{i=1}^{n} (y_i - B_0 - B_1 x_i)^2 \qquad (3)$$

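For a single attribute, the estimators that minimize Equation 3 have a closed form, and the decision boundary is the attribute value at which the predicted probability crosses a chosen threshold. The sketch below assumes that observed log-odds values are available for a few attribute values; the data, the 0.5 threshold and the function names are hypothetical.

```python
import math


def ols_fit(xs, ys):
    """Closed-form least-squares estimates B0, B1 for y = B0 + B1 * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1


def probability(log_odds):
    """Invert the log-odds: P(Y = 1) = 1 / (1 + exp(-log_odds))."""
    return 1.0 / (1.0 + math.exp(-log_odds))


def decision_boundary(b0, b1, threshold=0.5):
    """Attribute value x at which P(Y = 1) equals `threshold`."""
    return (math.log(threshold / (1.0 - threshold)) - b0) / b1


attribute_values = [0, 1, 2, 3]      # hypothetical attribute values
observed_log_odds = [-3, -1, 1, 3]   # hypothetical log-odds at those values
b0, b1 = ols_fit(attribute_values, observed_log_odds)
```

With these numbers the fitted line is y = -3 + 2x, so suspects whose attribute value exceeds 1.5 fall on the side of the boundary where the estimated probability is above one half.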
C. Using Chi-squared Analysis to Short-list Suspects 


Finally, SISC employs chi-squared analysis to identify a 
subset of suspects who are likely to have committed the 
crime under consideration, using the structure of the 
decision tree and the decision boundaries of its 
Categorization Attributes. Chi-squared analysis allows SISC 
to identify the path (i.e., branch) in the decision tree 
whose nodes contain the likely suspects. To do so, SISC 
computes the chi-squared value for each path in the decision 
tree and compares the values. Usually, paths that yield 
higher chi-squared values contain potential suspects for the 
case under consideration. The nodes of the path p that 
yields the highest chi-squared value contain the likely 
suspects of the crime. Specifically, the leaf node of the 
path p contains the short-listed suspects, so the forensic 
investigators can focus their investigation on the suspects 
contained in this leaf node. The chi-squared value 
($\chi^2$) is computed as shown in Equation 4 [3]. 


$$\chi^2 = \sum \frac{(O - E)^2}{E} \qquad (4)$$


where O and E are the observed and expected frequencies, 
respectively, of the data. The expected frequency (E) for 
each possible value of the variable is computed using 
Equation 5. 


$$E = np \qquad (5)$$


where n is the size of the sample and p is the relative 
frequency (or probability). 
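Equations 4 and 5 translate directly into code. In the sketch below the per-path observed counts and the expected probabilities are hypothetical numbers chosen for illustration; only the two formulas come from the text.

```python
def expected_frequencies(n, probabilities):
    """E = n * p for every possible value of the variable."""
    return [n * p for p in probabilities]


def chi_squared(observed, expected):
    """Sum of (O - E)^2 / E over all values."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical observed counts along two paths of the decision tree.
paths = {"path A": [22, 18], "path B": [35, 5]}
expected = expected_frequencies(40, [0.5, 0.5])
scores = {name: chi_squared(obs, expected) for name, obs in paths.items()}
best_path = max(scores, key=scores.get)
```

Here path B deviates far more from the expected counts than path A, so its leaf node would hold the short-listed suspects.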


III. DATA ACQUISITION TECHNIQUES IN MOBILE 
FORENSICS 


Mobile devices are a source of a large amount of digital 
data. Mobile forensics is the branch of digital forensics 
which aims at investigating the digital evidence recovered 
from a cell phone in a forensically sound manner. It helps 
to carefully gather and analyse the evidence that exists in 
a mobile phone in the form of text messages, chats, call 
logs, multimedia files, browsing history, GPS location and 
so on without compromising the integrity of the data [6]. 
One of the important concerns is to get access to deleted 
data that resided in the mobile phone. 


A. Steps in Mobile Forensics 


• Identification: 
This step deals with physically identifying the mobile 
device that may prove to be a potential source of evidence 
for the investigation. 

• Preservation: 
This step deals with isolating the mobile device from the 
outside world in order to avoid contamination of the 
data present within it. 

• Acquisition: 
This step deals with obtaining a mirror image of the 
device so as to avoid the loss of information that can be 
caused by real-time factors such as battery drainage, 
physical damage, etc. Data resides in mobile phones within 
the SIM card and in internal and external memory, within 
the file and directory structures. A major concern is to 
acquire data that has been deleted on purpose by the 
suspect. 

• Analysis: 
The acquired data is carefully examined and analysed in 
order to gain meaningful insights from it. 

• Documentation: 
This step aims at preparing a document which gives an 
account of all the insights that have been obtained from 
the investigation process. 

• Presentation: 
This is the final step, which aims at presenting the 
evidence in court proceedings so that it is accepted by 
the judiciary. 


B. Data Acquisition Techniques 


Acquisition of data is an important step in mobile forensics. 
Acquiring both deleted and undeleted data present in mobile 
phones is necessary in order to discover relevant artifacts 
and find meaningful insights from them. 

Manual acquisition requires manual examination of the mobile 
device. This technique is the simplest one and is suitable 
when there is an immediate need to obtain evidence from the 
device. 

Logical acquisition requires a bit-by-bit copy of logical 
storage objects obtained from the allocated spaces in the 
memory of the mobile device. 

Physical acquisition requires a bit-by-bit copy of the entire 
physical storage of the device. This technique allows the 
extraction of deleted data, which is not possible in the case 
of logical acquisition. Chip-off, which is a part of physical 
acquisition, requires the physical removal of flash memory by 
desoldering the chip from the device. This technique is, 
however, difficult, and expert training is needed for the 
forensic examination. 
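The idea of a bit-by-bit copy whose integrity can be demonstrated later can be sketched as follows. This is an illustration only, not a forensic tool: it assumes the storage is already exposed as an ordinary readable file, and the use of SHA-256 as the integrity check is an assumption of this sketch rather than something prescribed by the text.

```python
import hashlib


def acquire_image(src_path, dst_path, chunk_size=1 << 20):
    """Copy src to dst chunk by chunk and return the SHA-256 of the data read."""
    digest = hashlib.sha256()
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            dst.write(chunk)
            digest.update(chunk)
    return digest.hexdigest()


def sha256_of(path, chunk_size=1 << 20):
    """Hash an existing file, used to verify the copy afterwards."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Recording the digest at acquisition time and recomputing it before analysis gives a simple way to show that the working copy has not changed.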

Mobile Forensics Acquisition Methods 

The acquisition techniques involved in mobile forensics are 
quite different from those of other digital forensics 
techniques. Logical and physical acquisition are the 
important acquisition techniques involved in mobile 
forensics. 

1) Logical Acquisition Techniques: Logical acquisition 
techniques work on the principle of acquiring a bit-by-bit 
copy of logical storage objects from the allocated spaces. 
They fail to recover data from the slack spaces and hence 
cannot obtain deleted data. They are best suited to 
unrooted devices, even though less data is recovered than 
with physical acquisition. The only requirement for 
performing logical acquisition is to have USB debugging 
mode enabled on the mobile device. 


• ADB Pull: 
On unrooted mobile devices the ADB daemon running on 
the device runs with shell permissions. As a result, 
most files of evidential value are not accessible. An ADB 
pull can still access useful files such as unencrypted apps 
and most of the temporary file systems, which can include 
user data such as browser history, and system information. 
However, if root privileges are available, the ADB pull 
method is effective for analysing the files of interest 
from the workstation [13]. 

• Backup Analysis: 
Many backup options are available to store the mobile 
device data in a particular location and restore it as 
needed. Many of the backup utilities have an SD card option 
as well as options to save data to the cloud [13]. Backup 
analysis uses a backup image obtained from the phone 
in order to carry out the investigation. 

• AF Logical: 
It is an Android forensics logical technique which is 
distributed free to law enforcement and government 
agencies. The app, developed by viaForensics, is available 
on GitHub and extracts data using Content Providers [14]. 
The AF Logical app takes advantage of the Content 
Provider architecture to gain access to data stored on the 
device. Some examples of Content Providers are: 


– SMS/MMS 
– Contacts 
– Calendar 


• Commercial Providers: 

Many of the commercial mobile forensic software vendors 
now support Android-based acquisition. It can be helpful 
for a forensic examiner to understand how each of the 
forensic software vendors implements Android support. 
Examples of commercial providers are: 

– Cellebrite UFED 

– Compelson MOBILedit 

– viaForensics’ viaExtract 


For a comparison of logical acquisition techniques see 
Figure 2. 
Fig. 2. Comparison of Logical acquisition techniques 
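As an illustration of the ADB pull method listed above, the sketch below builds the command line for `adb pull` and, optionally, runs it. It assumes that the Android platform tools are installed on the workstation and that USB debugging is enabled on the device; the helper names, the example paths and the device serial are hypothetical.

```python
import subprocess


def adb_pull_command(remote_path, local_dir, serial=None):
    """Build the argument list for `adb [-s SERIAL] pull REMOTE LOCAL`."""
    cmd = ["adb"]
    if serial:
        cmd += ["-s", serial]  # address one device when several are attached
    cmd += ["pull", remote_path, local_dir]
    return cmd


def adb_pull(remote_path, local_dir, serial=None):
    """Run the pull; raises CalledProcessError if adb reports a failure."""
    return subprocess.run(
        adb_pull_command(remote_path, local_dir, serial),
        check=True,
        capture_output=True,
        text=True,
    )
```

On an unrooted device such a pull only succeeds for paths that the shell user is allowed to read, which is the limitation described above.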


2) Physical Acquisition Techniques: Physical acquisition 
involves a bit-by-bit copy of the entire physical storage. 
It allows deleted and obsolete data to be extracted along 
with the other contents of a mobile device. It is performed 
after gaining root-level access to the mobile device so as 
to get complete control of the system. However, rooting a 
phone causes modifications to the device data. 

3) Hardware Based Acquisition: This technique consists 
of methods which connect hardware to the device or 
physically extract device components. The hardware-based 
methods are particularly useful for unrooted devices but 
require forensic analysts who have expert training in 
forensics. 

• JTAG: 

The Joint Test Action Group (JTAG) has specified a 
standard, called boundary scan, for testing and debugging 
printed circuit boards (PCBs). Investigators use JTAG to 
access and retrieve the content of memory chips and create 
forensic images of these chips [15]. 

• Chip-off: 
This technique allows recovery from damaged devices 
and also circumvents passcode protection. However, the 
physical removal of NAND chips often damages the 
connectors on the bottom of the chip and, as a result, 
damages the data [13]. 


For a comparison of physical acquisition techniques see 
Figure 3. 
Fig. 3. Comparison of physical acquisition techniques 


4) Software Based Acquisition: This technique makes use 
of software on the device in order to obtain digital 
artifacts. It does not involve physical removal of hardware 
components; as a result, it does not cause permanent damage 
to the device. According to [13], this acquisition works 
only with root privileges and with USB debugging enabled. 
From a forensics standpoint, temporary root privileges or 
root access via a custom recovery mode are preferred. Many 
commercial and open source vendors provide physical 
acquisition software, such as: 


• Cellebrite UFED 
• Oxygen Forensic Suite 
• Wondershare Dr. Fone for Android 


IV. IOT DEVICES 


IoT refers to any device that is connected to the Internet. 
These devices can communicate with each other over networks 
and make smart decisions based on the data shared between 
them [16]. IoT creates a new world in which people and 
corporations can control their assets effectively and make 
informed decisions about what they want to do. IoT has 
recently become widely available, and this will initiate 
multiple useful societal changes, improving our 
accessibility, health, and safety [17]. 

IoT devices can be classified into three categories: 


1) Wearable devices: 
Wearable devices, which track an individual’s movements 
or actions [18], are commonly connected to smartphones via 
Bluetooth and, from there, to the Internet. They have a 
significant impact on our lives, particularly with regard 
to healthcare and communication. Currently, the wearable 
market is dominated by fitness gadgets, especially 
wearables that monitor one’s heart rate or count steps. 
The wearable market comprises the following devices: 
smartwatches, headwear, smart clothes, smart jewellery, 
and other devices that facilitate mindfulness [19]. 

2) Smart home devices: 
Products from this branch of IoT are commonly connected 
to the Internet via a wireless connection to the home 
router. They include all domestic devices, from lamps and 
light switches to motion sensors, and from gate locks to 
automatic curtains [17]. 

3) M2M (Machine to Machine) devices: 
They allow networked devices to transfer data and ex- 
ecute operations without manual human involvement. 
M2M devices are used in many fields. In telemedicine, for 
example, M2M devices facilitate the real-time monitoring 
of patients’ live statistics, provide medicine when needed, 
and track healthcare assets. M2M devices play a crucial 
role in remote control, robotics, traffic control, security, 
fleet control, and the automotive industry. 


A. Cyber Attacks on Internet of Things 


Cyber-attacks on the IoT are a major challenge at present, 
as they disrupt the normal functioning of these devices. It 
is necessary to be aware of IoT security issues and of the 
other challenges posed by cyber-attacks. A cyber-attack 
is an attack launched from one or more computers toward 
another computer, multiple computers, or entire networks. 
These attacks can be divided into two broad categories: 
attacks that aim to damage or deactivate the destination 
computer, and attacks whose purpose is to obtain access to 
the target computer’s data and potentially gain admin 
privileges on it [21]. 

Cyber-attacks take multiple forms. Passive attacks typically 
involve monitoring unsafe network connections, for example 
to observe unencrypted transfers of sensitive data, whereas 
active attacks attempt to alter or disrupt the targeted 
system. These attacks can target any IoT device. They are 
used to inflict damage, disable system control, or gain 
access to the victim’s private information [22, 23]. On a 
larger scale, government networks, such as utility service 
systems, are attacked to cut off the water or electricity 
supply to residents. Domestic examples include attacks on 
home automation systems, whereby attackers gain control of 
air conditioning, heating systems, lighting, and physical 
security systems. The information obtained from sensors 
used in heating and lighting systems could tell the 
intruder when nobody is home. 

The rise of cyber-attacks is considered a big threat to IoT 
devices, which makes it necessary to concentrate on the 
security of these devices. Both governments and businesses 
are investing heavily in data security measures. They are 
using a wide variety of tools and techniques to manage the 
security threats, while adversaries are attempting to breach 
security by circulating malicious software, such as viruses, 
botnets, and Trojans, to obtain important data. It is 
crucial to identify such attacks in their early stages, not 
only when they are actually taking place, so as to protect 
our systems with effective security measures. In the cyber 
security world, it is hard to foretell a possible attack 
without understanding the vulnerability of the network. It 
is therefore crucial to identify and explain the different 
existing cyber-attack methods in order to address the 
vulnerabilities in our networks. Cyber-attackers or hackers 
carry out these attacks for financial gain, espionage, 
competitive advantage and sometimes for their personal 
entertainment. These aims can be achieved through different 
kinds of cyber-attacks, which are as follows: 


1) Malware: 
This refers to any type of software that is created to 
inflict damage to a personal computer, server, or entire 
network. Trojans, worms, and viruses are each distinct 
forms of malware. These attacks may render the computer 
or network inoperable or grant the attacker root access, 
enabling them to manage the system remotely. Malware takes
three basic forms [20], covered in the next three items:

2) Spyware: 
Spyware is the most common form of malware used in
the theft of useful information. Spyware, as the name
suggests, spies on users to obtain the information they
enter on their computer or browser.

3) Viruses: 
A computer virus, similar to a flu virus, is designed to 
spread from host to host and has the capacity to replicate 
itself. Moreover, similar to how flu viruses cannot survive 
without a host cell, computer viruses cannot spread in the 
absence of host programs, such as files or documents. 

4) Worms: 
Worms, though far less common, can be more of a menace
due to their ability to survive independently. Worms
behave similarly to viruses, except that worms seek to destroy
the host. This is particularly dangerous because contagion
can grow much faster when they are not tethered to a host.
The fact that no hosts are required also makes worms
harder to catch.

5) Man in the middle attacks: 
Attackers use this method to covertly place themselves
between the user and the web service they are trying to
access. For example, an attacker might set up a Wi-Fi
network with a login screen intended to simulate a hotel
network; when a user logs in, the attacker can access any
data the user inputs, including bank passwords.

6) Denial-of-service attacks: 
This is a brute-force method that attempts to prevent
online services from running smoothly. For example, an
attacker may create huge amounts of traffic on a website or send
multiple calls to a database, thereby limiting the system’s
ability to operate and rendering the service unavailable to
users. This kind of attack often uses a multitude of computers
to create a constant stream of one-way traffic.

7) Phishing:
Phishing is a method deployed by cybercriminals in
which emails that trick a victim into performing a harmful
action are disseminated. For example, the receiver may be fooled
into downloading malware that is disguised as a useful
document. Often, the receiver is implored
to click on a link that takes them to a bogus website
that demands they input sensitive data such as bank
details. Phishing emails are usually sent to thousands
of potential victims at random, but some are crafted
to appeal to specific individuals, attempting to persuade
them to submit private information.

8) Public unsecured Wi-Fi network attacks:
This form of cyber-attack commonly targets Wi-Fi users
at home or in shopping centers, restaurants, airports, and
other public places. These networks are not password
protected. They are therefore far more vulnerable to
cyber-attacks than private, password-protected networks.
Cyber-attacks can target these networks and extract the
information that is transmitted across them.


B. Defending Against Cyber Attacks 


Security is one of the biggest problems associated with IoT
devices. In the rapidly developing Internet of Things, attackers are
able to compromise systems more easily than ever before. Because IoT devices
are intimately connected, all a hacker needs to do is find
one vulnerability to gain control of the entirety of the
data. To recognize or manage cyber-attacks, it is important to
understand the weaknesses of the network. It is also necessary
for cyber security partners to ascertain the motivations of
the attacker, thereby identifying the nature of the data at risk
and understanding why the attack took place. In the previous
part, attack types and motivations were discussed. In this part,
we explore some significant methods used to protect systems
against cyber-attacks. They are as follows:


1) The setting of strong passwords:
More than 75% of business-related cyber-attacks target
connected networks that use weak passwords. Enacting
tighter password regulations on all devices, including
networks, computers, monitoring cameras, etc., can help
enhance a business’s security level.

Here are some tips for creating strong passwords:

• Establish and implement a password system for anyone
accessing network resources.
• Do not use the same or a similar password for various accounts
or devices.
• Do not save passwords on publicly used devices.
• Do not log into systems that use unsafe or unencrypted
servers, like public Wi-Fi.
• Create passwords that are hard to replicate by using
symbols, numbers, and uppercase letters. Also
avoid simple words or sequences that are easy
to guess, like “123456” or “password”.
• Prevent unwanted or unnecessary access to certain
information by not sharing your passwords.
• Change the manufacturer’s default passwords to your own
private ones as soon as you buy a device.
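
The tips on password composition can be turned into a simple automated check. The following is a minimal sketch, not part of the surveyed work: the function name `password_problems`, the small `COMMON_PASSWORDS` deny-list and the minimum length of 12 characters are all assumptions chosen for illustration.

```python
import string

# Hypothetical deny-list for illustration; real deployments use far larger lists.
COMMON_PASSWORDS = {"123456", "password", "qwerty", "admin"}


def password_problems(password: str, min_length: int = 12) -> list:
    """Return a list of problems found; an empty list means every check passed."""
    problems = []
    if len(password) < min_length:
        problems.append(f"shorter than {min_length} characters")
    if password.lower() in COMMON_PASSWORDS:
        problems.append("appears in the list of commonly guessed passwords")
    if not any(c.isupper() for c in password):
        problems.append("no uppercase letter")
    if not any(c.isdigit() for c in password):
        problems.append("no digit")
    if not any(c in string.punctuation for c in password):
        problems.append("no symbol")
    return problems
```

A check of this kind only covers composition; it cannot tell whether a password is reused across accounts or has been shared with others.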


2) Keep your software up to date:
If you wish to protect your IoT devices from cyber-attacks,
you should check that your software is up to date.
Out-of-date software may be riddled with flaws that
permit hackers to access information. Companies often
release software updates to defend
their products against possible exploits (and, of course,
to enhance features), so when your IoT device
manufacturer sends you updates, do not forget to install them, since they
might fix a security flaw. Set your devices to
update their software automatically, or go
to the manufacturer’s website and check for the newest update. Be
sure to regularly download updates and install them on
your device. This will help keep your devices as
up to date and as well guarded as they can be. Always strive to run
the latest and most trustworthy software.

3) Using anti-virus:
Anti-virus software detects well-known malware and
scans unknown files and URLs to check whether they are
included in the blacklist. However, using anti-virus software
alone does not sufficiently protect your devices and
networks against cyber-attacks [21].
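
A blacklist lookup of the kind mentioned here can be sketched in a few lines. This is an illustrative assumption about how such a check might work, not a description of any particular anti-virus product: the helper names `sha256_hex` and `is_blacklisted` are hypothetical, and the blacklist is modelled as a set of SHA-256 digests of known malicious files.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Fingerprint a file's content with SHA-256."""
    return hashlib.sha256(data).hexdigest()


def is_blacklisted(data: bytes, blacklist: set) -> bool:
    """Report whether the content's fingerprint appears in the blacklist."""
    return sha256_hex(data) in blacklist


# Hypothetical sample standing in for a known malicious file.
blacklist = {sha256_hex(b"known-bad-sample")}
exact_copy_caught = is_blacklisted(b"known-bad-sample", blacklist)
modified_copy_caught = is_blacklisted(b"known-bad-sample!", blacklist)
```

Here the exact copy is caught while the copy with one extra byte is not, which is one reason why blacklist-based scanning on its own is insufficient.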

4) Avoid using public Wi-Fi networks:
One of the greatest dangers associated with
free Wi-Fi is that hackers can infiltrate the connection
between you and the source. In this case, instead of
communicating directly with the hotspot, your data may
be sent to the hacker. Generally, using public Wi-Fi
networks is not a good idea. If you must use them, it
is recommended that you use a VPN or SSL connection
[22].

5) Network firewall:
A firewall is a network protection system that filters
traffic flowing in and out of networks according to
a set of predefined security rules. A network firewall,
which may be implemented in software or hardware, creates a wall that
isolates the trusted network from untrusted networks and
prevents unauthorized access to the system [24].
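
Filtering traffic against predefined rules can be illustrated with a first-match rule table. The sketch below is a simplification under stated assumptions: the `Rule` fields, the `decide` function and the example addresses are hypothetical, and real firewalls also consider protocol, direction and connection state.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import Optional


@dataclass(frozen=True)
class Rule:
    action: str           # "allow" or "deny"
    source: str           # CIDR block the rule applies to
    port: Optional[int]   # destination port, or None to match any port


def decide(rules, src_ip: str, dst_port: int, default: str = "deny") -> str:
    """Return the action of the first matching rule, or the default action."""
    for rule in rules:
        in_block = ip_address(src_ip) in ip_network(rule.source)
        if in_block and rule.port in (None, dst_port):
            return rule.action
    return default


# Hypothetical policy: the local subnet may reach port 443, all else is denied.
rules = [
    Rule("allow", "192.168.1.0/24", 443),
    Rule("deny", "0.0.0.0/0", None),
]
```

With this policy, `decide(rules, "192.168.1.20", 443)` returns `"allow"`, while the same host on port 23 or an outside host on port 443 is denied. Defaulting to `"deny"` means that traffic matching no rule is blocked.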


V. CONCLUSION 


We proposed in this paper a digital forensic system called 
SISC that can help forensic investigators short-list the primary 
suspects of a crime by predicting the likely ones to have 
committed the crime. It helps them focus their investigation 
on a small and tightly defined group of suspects. We studied 
the different logical and physical acquisition techniques for 
mobile devices. From the results obtained, we conclude that 
no single tool can provide complete insights of the device; 
hence it is better to use more than one tool in order to help 
the forensic investigators solve a case in Mobile Forensics. 
IoT devices are being used in almost every area of life, but the
majority of IoT devices are still vulnerable to attacks. There is a
lack of literature demonstrating an end-to-end implementation
of real-time IoT attack detection. This paper evaluated the
effectiveness of detecting only known attacks. New attacks
are being launched constantly, and this implementation is still
limited in its ability to learn such new attacks automatically and to detect them.


REFERENCES 


[1] Sneha C. Sathe and Nilima M. Dongre, “Data Acquisition Techniques in
Mobile Forensics,” Proceedings of the Second International Conference
on Inventive Systems and Control, IEEE, 2018.
[2] Yamini Konduru, Nishchol Mishra, and Sanjeev Sharma, “Acquisition
and Analysis of Forensic Data Artefacts of Some Popular Apps in
Android Smartphone,” IEEE, 2018.
[3] Kamal Taha and Paul D. Yoo, “A Forensic System for Identifying the
Suspects of a Crime with No Solid Material Evidences,” IEEE, 2018.
[4] Songyang Wu, Xiong Xiong, Yong Zhang, Yang Tang, and Bo Jin, “A
General Forensics Acquisition for Android Smartphones with Qualcomm
Processor,” IEEE, 2017.
[5] Ghazi Abdalla Abdalrahman and Hacer Varol, “Defending Against
Cyber-Attacks on the Internet of Things,” IEEE, 2019.
[6] Sengul Dogan and Erhan Akbal, “Analysis of Mobile Phones in Digital
Forensics,” MIPRO 2017, May 22-26, 2017, Opatija, Croatia.
[7] Khawla Abdulla Alghafli, Andrew Jones, and Thomas Anthony Martin,
“Forensics Data Acquisition Methods for Mobile Phones,” The
7th International Conference for Internet Technology and Secured
Transactions (ICITST-2012).
[8] O. Al-Jarrah, O. Alhussein, P. Yoo, S. Muhaidat, K. Taha, and
K. Kim, “Data Randomization and Cluster-Based Partitioning for Botnet
Intrusion Detection,” IEEE Transactions on Cybernetics, 2015.
[9] S. Cha and C. Tappert, “A Genetic Algorithm for Constructing Compact
Binary Decision Trees,” Pattern Recognition Res., 2009.
[10] David Hosmer, Applied Logistic Regression. Hoboken, New Jersey:
Wiley, 2013.
[11] M. S. Nikulin, “Chi-squared test for normality,” in Proceedings
of the International Vilnius Conference on Probability Theory and
Mathematical Statistics, 1973.
[12] P. E. Greenwood and M. S. Nikulin, A Guide to Chi-Squared Testing.
New York: Wiley, 1996.
[13] Andrew Hoog, Android Forensics: Investigation, Analysis and Mobile
Security for Google Android, Syngress, 2011.
[14] Abdalazim Abdallah Mohammed Alamin and Amin Babiker A/Nabi
Mustafa, “A Survey on Mobile Forensic for Android Smartphones,”
IOSR Journal of Computer Engineering (IOSR-JCE).
[15] Khawla Abdulla Alghafli, Andrew Jones, and Thomas Anthony Martin,
“Forensics Data Acquisition Methods for Mobile Phones,” The
7th International Conference for Internet Technology and Secured
Transactions (ICITST-2012).
[16] R. Porkodi and V. Bhuvaneswari, “The Internet of Things (IoT)
Applications and Communication Enabling Technology Standards: An
Overview,” 2014 International Conference on Intelligent Computing
Applications, Coimbatore, pp. 324-329, 2014.
[17] C. Links, “The Internet of Things will Change our World,” ERCIM
EEIG, Sophia Antipolis Cedex, France, 2015.
[18] M. Haghi, K. Thurow, and R. Stoll, “Wearable Devices in Medical
Internet of Things: Scientific Research and Commercially Available
Devices,” Health Inform Res, vol. 10, no. 4258, pp. 4-15, 2017.
[19] S. Best, “What is wearable technology? Everything you
need to know about the popular gadgets,” mirror.co.uk, 3 May 2018.
[20] H. Teymourlouei, “Quick Reference: Cyber Attacks Awareness and
Prevention Method for Home Users,” World Academy of Science,
Engineering and Technology International Journal of Computer and
Systems Engineering, vol. 9, no. 3, 2015.
[21] P. Rubens, “Anti-Virus Isn’t Enough: 7 Steps to Discourage Hackers,”
esecurityplanet.com, 6 March 2013.
[22] J. Dolly, “Why you should never, ever connect to public WiFi,”
csoonline.com, 9 January 2018.
[23] L. James, “5 Tips for Securing Your Smart Devices and IoT
Devices,” makeuseof.com, 15 November 2018.
[24] S. Mansuri, “Security measures to protect your IoT devices,”
jaxenter.com, 12 October 2018.


Athul Babu et al., Findings on Digital Forensics and IOT Devices



Proceedings of Vidya MCA Departmental Seminar (VMCADS - 2021), 22 - 23 November 2021 


Vidya Academy of Science & Technology, Thrissur — 680501 


IoT: A Revolutionary Approach 
for Future Technology Enhancement 


Greeshma Gopinathan, Hiba Mohamed M P 
and Shaheena A H 
Vidya Academy of Science & Technology 
Thrissur - 680501, Kerala 


Abstract—The Internet of Things (IoT) envisions pervasive,
connected, and smart nodes interacting autonomously while
offering all sorts of services. The wide distribution, openness and
relatively high processing power of IoT objects make them an
ideal target for cyber attacks.

As many IoT nodes collect and process private
information, they are becoming a goldmine of data for malicious
actors. Security, and specifically the ability to detect compromised
nodes, together with collecting and preserving evidence of an
attack or malicious activities, emerges as a priority in the successful
deployment of IoT networks.

Index Terms—Internet of Things, digital forensics, cyber security,
BlockChain, integrity, security


I. INTRODUCTION 


THE INTERNET OF THINGS (IoT) integrates various sensors,
objects and smart nodes that are capable of communicating
with each other without human intervention.
The objects/things function autonomously in connection with
other objects. IoT nodes are capable of delivering lightweight
data, accessing and authorizing cloud-based resources for
collecting and extracting data, and making decisions by analysing
collected data. The emergence of IoT has led to pervasive
connection of people, services, sensors and objects. IoT devices
are now deployed in a wide range of applications, from
smart grids to healthcare and intelligent transport systems.
The huge business opportunities that exist within the IoT domain
have significantly increased the number of smart devices and intelligent,
autonomous services offered in IoT networks. Moreover, the
reliance of IoT devices on cloud infrastructure for data transfer,
storage and analysis has led to the development of cloud-enabled
IoT networks. Security issues such as privacy, access control,
secure communication and secure storage of data are becoming
significant challenges in the IoT environment. Moreover, every
single device that we create, every new sensor that we deploy,
and every single byte that is synchronized within an IoT environment
may at some point come under scrutiny in the course
of an investigation. The fast growth of IoT devices and services
has led to the deployment of many vulnerable and insecure nodes.
Moreover, conventional user-driven security architectures are
of little use in object-driven IoT networks.

Fig. 1. IoT applications


There is now considerable interest in the Internet of Things
(IoT) as an evolution of data communications that allows
direct, persistent, and automated device-to-device communication
(also known as Machine-to-Machine [M2M] communication
or Cyber-Physical Systems [CPSs] communication). The
principal applications of blockchains to date have been
financial transaction execution, smart contracts, and cryptocurrencies.
However, new potential applications are emerging.

The IoT application space is significant. The IoT endeavors
to add computer-based logic to a large universe of objects
or things, which can then be monitored and/or controlled
by a centralized (often cloud-based) analytics or management
engine; remote objects are almost invariably connected using
wireless networks. In IoT, devices and entities in the physical
world are afforded a digital representation. This digital ‘wrapper’
enables interaction with Information and Communications
Technology (ICT) elements located on a Local Area Network
(LAN), at the other end of a Wide Area Network (WAN), or
on a public, private or hybrid cloud.


II. IOT APPLICATIONS 


IoT applications have focused on two broad areas: industrial
automation in the context of process control in factories
(Industrial IoT); and sensing applications of all sorts, including
power grid administration, traffic monitoring, Intelligent Transportation
Systems (ITSs), smart cities, video surveillance, body area networks/e-health, and
crowdsensing. These two areas deal a lot with the physical
aspects of the sensors, the wireless link, and, to some degree,
with analytics. There is a third class of applications that deal
less with the physical nature of the sensors themselves and
a lot more with the data analytics: these applications address
the fundamental transformation of Business Processes (BPs)
related to common commercial functions such as banking,
insurance, enterprise and organizational operations (including
government functions), and healthcare delivery optimization.
Given the scope of the application space, security in the
IoT environment is considered critical, especially under the
circumstances of (typically) limited computational, memory,
power and control capabilities of the end-nodes and the physical
exposure of these end-nodes. While security is certainly
important for the first two classes of applications listed above,
it is absolutely critical for those business-oriented applications
that almost invariably deal with Personally Identifiable Information
(PII).


III. IOT FACTORS IMPACTING SECURITY 


The challenges associated with reliable security in IoT are 
driven by the following factors: 


• IoT/CPS technology and systems are relatively new and
are, therefore, less well understood than traditional IT
systems.
• IoT/CPS systems are almost invariably distributed over a
wide (regional) geography, typically in uncontrolled open
environments.
• IoT/CPS systems are often administratively federated,
in the context of multiple heterogeneous environments,
processes, and technologies, not to mention the diffuse
security mechanisms often in place.
• IoT/CPS systems are currently deployed insularly across
vendor-specific vertical applications, creating fragmented
technology and administrative silos.
• End-to-end comprehensive standards for architecture, networking,
or security have not been developed, stabilized,
adopted, or implemented; standardization would enable
simplicity and the ability to integrate systems (including
security) from best-in-breed vendors.
• IoT/CPS endpoints in different (vertical) applications
often use different addressing models and addressing
formats, creating complexity.
• IoT/CPS Operating Systems (OS) may typically have
streamlined feature sets that limit functional capabilities
and/or sophistication.
• IoT/CPS systems often employ inexpensive, low-complexity
nodal platforms with limited computational power
and memory, thus precluding or limiting the use of an
on-board heavy-duty firewall.
• IoT/CPS endpoint systems have limited electrical power
(typically being battery-driven).


IV. BLOCKCHAIN 


The concept of blockchains is now receiving considerable
research and practical interest. Blockchains provide data integrity
across a large number of transactional parties by providing
all participants in the ecosystem with a working proof
of decentralized trust; classically, this assurance of integrity
had to be achieved by utilizing a trusted third party to ‘escrow’
elements of the transaction, and a blockchain replaces this trusted
third party. A blockchain is a cryptographically linked list of
blocks created by nodes, where each block has a header,
the relevant transaction data to be protected, and ancillary
security metadata (e.g., creator identity, signature, last block
number, and so on). It facilitates “decentralized consensus” by
being a distributed ledger (which is effectively a distributed
database) that retains an expanding list of records, while
simultaneously precluding retrospective revision or tampering of such records.
Because blockchains are intrinsically resistant
to modification of the underlying data, they are perceived
as embodying a tamper-resistant, incorruptible, decentralized
digital ledger for economic or logical transactions related to
virtually anything of value.
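
The cryptographic linking of blocks can be illustrated with a minimal sketch. It is not taken from any particular blockchain platform: the field names (`index`, `creator`, `data`, `previous_hash`) and the helper functions are hypothetical, and digital signatures and consensus are deliberately left out.

```python
import hashlib
import json


def block_hash(block: dict) -> str:
    """Hash a block's full content in a stable (sorted-key) serialisation."""
    payload = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def append_block(chain: list, data: str, creator: str) -> None:
    """Append a block that stores the hash of its predecessor."""
    previous = block_hash(chain[-1]) if chain else "0" * 64
    chain.append({
        "index": len(chain),
        "creator": creator,
        "data": data,
        "previous_hash": previous,
    })


def is_valid(chain: list) -> bool:
    """Check that every block still points at the unmodified previous block."""
    return all(
        chain[i]["previous_hash"] == block_hash(chain[i - 1])
        for i in range(1, len(chain))
    )
```

Changing the data of any block except the last alters its hash, so the link stored in the following block no longer matches and `is_valid` returns `False`. Protecting the most recent block requires the replication and consensus mechanisms discussed in the rest of this section.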

Fig. 2. Transaction flow in blockchain

The blockchain provides universal accessibility, incorruptibility,
openness and the ability to store and transfer data in
a secure manner. A blockchain can support a wide range of
tasks, including allowing parties to draw up trustworthy contracts,
storing sensitive information, and transferring money
safely, all without the intervention of an intermediary. A
blockchain is an “open platform”, a distributed system where
the processes are open to examination and elaboration. It is a
ledger of data, replicated across a plurality of computers organized
in a Peer to Peer (P2P) network. A blockchain records
the transactions on a multitude of distributed hosts, given that
a replicated, decentralized database effectively eliminates the
possibility of global data corruption (deliberate or accidental).
The blockchain is a time-stamped database that retains the
complete logged history of transactions on the system; each
transaction processor on the network or system retains its
own local copy of this database, and consensus-formation
algorithms allow every copy, no matter where such copy is,
to remain synchronized. In a blockchain, a P2P network is
required, as well as consensus algorithms to ensure that replication
across nodes is undertaken. Peers support the state of the
distributed ledger.

The P2P function implies that there is no central control in
the blockchain-secured network and all nodes can communicate
directly with each other using an appropriate protocol,
allowing for transactions (e.g., documents, data, cybercurrency)
to be exchanged directly among the peers. There typically
are two types of peers: Endorsing peers and Committing
peers. Endorsing peers simulate the transaction execution: they
execute and endorse the transaction; endorsement policies
specify the rules for the transaction endorsement. Committing
peers receive transactions endorsed by endorsing peers, verify
these transactions, and update the ledger; there may also
be Orderer nodes that receive transactions from endorsers,
sequence them, and forward these transactions to committing
peers.
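
The division of labour between endorsing peers, orderer nodes and committing peers can be sketched with a toy transfer ledger. Everything below is a hypothetical simplification for illustration: the function names, the balance-based transaction format and the policy of requiring two endorsements are assumptions, and no real platform API is used.

```python
def endorse(tx, peer_states):
    """Each endorsing peer simulates the transfer on its own state copy, unchanged."""
    return [
        name for name, balances in peer_states.items()
        if 0 < tx["amount"] <= balances.get(tx["sender"], 0)
    ]


def order(endorsed_txs):
    """The orderer only sequences transactions; it does not inspect their content."""
    return [dict(tx, seq=i) for i, tx in enumerate(endorsed_txs)]


def commit(ordered_txs, balances, ledger, required=2):
    """A committing peer checks the endorsement policy, re-verifies, then updates."""
    for tx in ordered_txs:
        enough = len(tx["endorsements"]) >= required
        funded = 0 < tx["amount"] <= balances.get(tx["sender"], 0)
        if enough and funded:
            balances[tx["sender"]] -= tx["amount"]
            balances[tx["receiver"]] = balances.get(tx["receiver"], 0) + tx["amount"]
            ledger.append(tx)
    return ledger


# Two transfers that are each valid alone but cannot both succeed.
state = {"alice": 10, "bob": 0}
peers = {"peer1": dict(state), "peer2": dict(state)}
txs = [
    {"sender": "alice", "receiver": "bob", "amount": 7},
    {"sender": "alice", "receiver": "bob", "amount": 7},
]
endorsed = [dict(tx, endorsements=endorse(tx, peers)) for tx in txs]
ledger = commit(order(endorsed), state, [])
```

Both transfers are endorsed because each is simulated against the same unchanged state, but only the first is committed. This is why the committing peers verify transactions again after ordering.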

The Internet of Things (IoT) is a smart network which
connects all things to the Internet for the purpose of exchanging
information with agreed protocols. So, anyone can
access anything, at any time and from anywhere. The Internet of
Things (IoT) is the next technological leap that will introduce
significant improvements to various aspects of the human
environment, such as health, commerce, and transport. The
IoT combines a wide range of technologies, such as sensors,
processing ability, Internet, cloud computing as well as many
communication infrastructures.


V. BLOCKCHAIN AND IOT 


Blockchain and IoT are both often mentioned as important
Digital Transformation (DX) technologies. Blockchain adoption in
combination with IoT adoption was called a DX sweet spot
by Gartner at the end of 2019, especially in the US. As Internet of
Things applications are by definition distributed, it is only natural
that distributed ledger technology, which is what blockchain
is, will play a role in how devices communicate directly
with each other (keeping a ledger, and thus a trail, of not just
devices but also how they interact and, potentially, in which
state they are and how they are ‘handled’ in the case of tagged
goods).


Fig. 3. IoT and BlockChain


Blockchain is designed as a basis for applications that
involve transactions and interactions. These can include smart
contracts (smart contracts are automatically carried out when a
specific condition is met, for instance regarding the conditions
of goods or environmental conditions) or other smart applications
that support specific Internet of Things processes. This
way, blockchain technology can improve not just compliance
in the IoT but also IoT features and cost-efficiency.
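
The idea of a contract that is carried out automatically when a condition on goods is met can be sketched in plain Python. This is only an illustration under stated assumptions: the class `ColdChainContract`, its fields and the temperature limit are hypothetical, and a real smart contract would run on a blockchain platform rather than in a single process.

```python
from dataclasses import dataclass, field


@dataclass
class ColdChainContract:
    max_temp_c: float                    # agreed condition for the goods
    payment: int                         # amount released on success
    readings: list = field(default_factory=list)
    status: str = "open"

    def report(self, temp_c: float) -> None:
        """Record a sensor reading; a violation settles the contract at once."""
        if self.status != "open":
            return
        self.readings.append(temp_c)
        if temp_c > self.max_temp_c:
            self.status = "refunded"

    def deliver(self) -> int:
        """On delivery, pay out only if no reading broke the condition."""
        if self.status == "open":
            self.status = "paid"
            return self.payment
        return 0
```

For example, a contract created with `ColdChainContract(8.0, 100)` pays out 100 on delivery if all reported temperatures stayed at or below 8.0, and pays nothing once a single reading has exceeded that limit.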


VI. IOT AND BLOCKCHAIN CHALLENGES TO SOLVE 


Forrester analyst Martha Bennett has confirmed that it is time to start
looking at the convergence of IoT and blockchain, even if the
combination of both might not be for today. Bennett
co-authored the report “Disentangle Hype
From Reality: Blockchain’s Potential For IoT Solutions”. She
defines three categories of challenges that Internet of Things
and blockchain ecosystem participants must address:


1) Technology:
This is mainly where security comes into the picture. In an
Internet of Things context, where IoT security is already
a challenge, it is clear that security needs even closer
attention. It is important to note, though, that blockchain
is also seen as a way to secure the Internet of Things
and, as mentioned, security overall, but that is another
discussion with several opinions and aspects to cover.

2) Operational challenges:
These concern the business model and the practical aspects, as this
requires many agreements and, of course, many actors
in a broad ecosystem. Just think about that IBM logistics
example.

3) Legal and compliance issues:
Bennett, among others, refers to responsibility issues in
case of actions that are taken by devices, based on
a rule that is automatically executed by a blockchain-based
application, triggered by another blockchain-based
application (you see the complexity). And then there is
the mentioned example of smart contracts. As you know,
contracts are far from easy, even outside this IoT and
blockchain context.


VII. SECURITY REQUIREMENTS IN IOT 


1) Confidentiality:
This term covers two related concepts. First, it signifies
that unauthorized services must not access private information.
Secondly, it assures the protection of privacy and
proprietary information.

2) Integrity:
Integrity means that information and IoT devices
cannot be modified or utilized by unauthorized users and
objects.

3) Availability:
Availability implies that the computing resources and
information should be available when they are needed
by a service.

4) Authenticity:
Authenticity assures that the information and transactions
are genuine.
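
Two of these requirements, integrity and authenticity, can be illustrated with a keyed message authentication code. The sketch below uses HMAC-SHA256 from the Python standard library; the function names `sign` and `verify`, the shared key and the sample message are assumptions for illustration, and the example does not address confidentiality or availability.

```python
import hashlib
import hmac


def sign(key: bytes, message: bytes) -> str:
    """Compute an HMAC-SHA256 tag over the message with the shared key."""
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def verify(key: bytes, message: bytes, tag: str) -> bool:
    """Accept the message only if the tag matches (constant-time comparison)."""
    return hmac.compare_digest(sign(key, message), tag)


# Hypothetical device key and sensor reading.
key = b"shared-device-key"
tag = sign(key, b"temp=21.5")
```

A receiver holding the same key accepts `b"temp=21.5"` with this tag, but rejects a modified reading such as `b"temp=99.9"` (integrity) and rejects any message tagged with a different key (authenticity).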


VIII. SECURITY CHALLENGES IN IOT


1) Automatic control:
Traditional information systems require users to
configure them, whereas IoT devices are expected to
configure themselves automatically.

2) Information Volume:
Many IoT applications, such as the
smart grid and smart city, process a huge volume of
sensitive and personal information, which is a potential
target of an ever-increasing number of security threats.

3) Interoperability:
The development and use of security mechanisms in
the IoT should not largely limit the functional capabilities
of the IoT devices.

4) Resource Constraints:
The devices in the IoT are characterized by constrained
resources in memory and computation; therefore, they
may not support the expensive operations of
conventional security measures, such as asymmetric
encryption.

5) Resilience to physical attacks and natural disasters:
IoT devices are typically small, with limited or no
physical protection. For instance, a mobile or sensor
device could be stolen, and fixed devices could be
moved or destroyed by natural disasters.

6) Privacy protection:
Typically, IoT devices hold sensitive data which
must be secured and must not be identifiable, traceable or
linkable.

7) Scalability:
IoT networks usually involve an enormous number
of objects. Therefore, the security and privacy protection
mechanisms should be able to scale.


IX. SECURITY THREATS IN IOT 


1) Sinkhole Attack:
In this attack, a malicious node attracts network traffic
towards itself. To launch this type of attack, the malicious
node induces all adjacent nodes to forward their packets
through it by advertising a minimal routing
cost.
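
Why a falsely low routing cost attracts traffic can be seen from a minimal sketch of cost-based next-hop selection. The function `next_hop`, the node names and the cost values are hypothetical, and real routing protocols use richer metrics than a single advertised number.

```python
def next_hop(advertised_costs: dict) -> str:
    """Pick the neighbour advertising the lowest routing cost to the base station."""
    return min(advertised_costs, key=advertised_costs.get)


# Honest neighbourhood: n2 offers the cheapest path.
honest = {"n1": 4, "n2": 3}

# The same neighbourhood after a malicious node advertises an implausibly low cost.
with_attacker = {"n1": 4, "n2": 3, "mal": 0}
```

Here `next_hop(honest)` selects `n2`, whereas `next_hop(with_attacker)` selects `mal`, so every node that trusts the advertisement forwards its packets through the attacker.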

2) WormHole Attack:
In this attack, the adversary creates a virtual tunnel
between two ends. An adversary node acts as a forwarding
node between two actual nodes. The two malicious
nodes usually claim that they are one hop away from the
base station.

3) Selective Forwarding Attack:
In this attack, a malicious node acts as a normal node but
selectively drops some packets. The black hole attack is
the simplest form of selective forwarding attack, in which
all packets are dropped by the malicious node.

4) Sybil Attack:
In this attack, a malicious node presents multiple identities. The routing
protocol, detection algorithm and co-operation processes
can be attacked by such a node.

5) Hello Flood Attack:
In a sensor network, the routing protocol broadcasts a hello
message so that a node can announce its presence to its neighbours. A
node which receives the hello message may assume that
the source node is within its communication range and
add this source node to its neighbour list.

6) Denial of Service (DoS) Attack:
This attack can damage the availability of resources.
When this attack is made, resources are not available to
legitimate users. When such an attack is launched by
several malicious nodes, it is called a Distributed Denial
of Service (DDoS) attack. This attack may
affect network resources, bandwidth, CPU time, etc.


X. CHALLENGES AND OPEN ISSUES 


At present, only very limited IoT support is available, and
the following key challenges exist.


1) Network Foundation:
Limitations of the current Internet architecture in terms
of mobility, availability, manageability and scalability are
some of the major barriers to IoT.

2) Security, Privacy and Trust:
Security challenges are:

• securing the architecture of IoT
• proactive identification and protection of IoT from
arbitrary attacks and abuse
• proactive identification and protection of IoT from
malicious software

Privacy challenges are:

• control over personal information and control over an
individual’s physical location and movement
• the need for privacy enhancement technologies and
relevant protection laws
• standards, methodologies and tools for identity
management of users and objects

Trust challenges are:

• the need for easy and natural exchange of critical,
protected and sensitive data
• trust has to be a part of the design of IoT and must
be built in


XI. CONCLUSION 


When we look at today’s state-of-the-art technologies, we
get a clear indication of how the IoT will be implemented on a
universal level in the coming years. We also get an indication
of the important aspects that need to be further studied and
developed to make large-scale deployment of IoT a reality.
It is observed that an urgent need exists for significant work
in the area of governance of IoT. Without a standardized
approach, it is likely that a proliferation of architectures,
identification schemes, protocols and frequencies will happen
in parallel, each one targeted at a particular and specific
use. This will inevitably lead to a fragmentation of the
IoT, which could hamper its popularity and become a major
obstacle to its rollout. Interoperability is a necessity, and inter-tag
communication is a precondition for the adoption
of IoT to be widespread.
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Recent advancements in IoT have drawn attention of re- 
searchers and developers worldwide. IoT developers and re- 
searchers are working together to extend the technology on 
large scale and to benefit the society to the highest possible 
level. However, improvements are possible only if we consider 
the various issues and shortcomings in the present technical 
approaches. 

The positive impact of the IoT on citizens, businesses
and governments will be significant, ranging from helping
governments reduce healthcare costs and improve quality
of life, to reducing carbon footprints, increasing access to
education in remote underserved communities, and improving
transportation safety.

In this paper, we discussed the Internet of Things and how
it works. We also described what blockchain is, its concepts
and its working, and how blockchain and IoT are related to
one another.

We also discussed how IoT factors impact security and the
problems that blockchain and IoT have to solve. We covered
the security requirements, security challenges and security
threats in IoT, as well as the challenges and issues in IoT.
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Abstract—With the continuous advancement in technical field 
many technologies are evolving day by day, and cloud computing is
one of them. With the help of cloud computing, users can easily
share, store and retrieve their data from anywhere. Cloud
computing provides hardware, software and infrastructural storage
to many users at a time. As many users share their data on
a cloud, the main question is about the security of the data present
on the cloud. In this paper, a solution is provided to maintain
data security and data integrity. The scheme involves algorithms
such as DES, 3DES, AES, Blowfish, Diffie-Hellman key
exchange and RSA. In this solution, data is encrypted by RSA
before it is uploaded to the cloud server.

Index Terms—DES, 3DES, AES, Blowfish,Diffie-Hellman key 
exchange, RSA. 


I. INTRODUCTION 


THE idea of cloud computing dates back to the 1960s of the
20th century. It was associated with John McCarthy. Actual
applications appeared at the beginning of the 2000s through
Microsoft, followed by Google. Cloud computing can be
defined as a set of services provided to a client or multiple
clients over the internet to take advantage of the capabilities
of the service provider without having to purchase expensive
hardware in the company to do the same tasks. The cloud provides
many beneficial services and applications, with advantages such
as cost saving, scalability, flexibility, reliability, maintenance
and mobile accessibility. It is the ability to access and share a
collection of resources that are owned and stored by another
party over the internet. Once information is shared over the
internet, we have to consider the security problems of
confidentiality, integrity and authentication.


II. TYPES OF CLOUD ARCHITECTURE 


There are two types of cloud architecture:

1) Service model:

• SaaS (Software as a Service) enables users to access
a service through simple software such as a browser,
for example Gmail.

• PaaS (Platform as a Service) enables users to develop
applications and deploy them, for example on Google App
Engine.

• IaaS (Infrastructure as a Service) enables users to
access computational and storage infrastructure
in a central service.


Fig. 1. Some of the different types of cloud security challenges


2) Deployment model:

• Public cloud: provides all computing applications and
resources to the general public or a large organizational
group through a single service provider.

• Private cloud: infrastructure leased to a single
organization so that it works for that organization alone.

• Community cloud: infrastructure shared between
different organizations.

• Hybrid cloud: infrastructure that combines the public and
private cloud models.


III. SECURITY ISSUES


Cloud computing security issues are among the biggest
challenges delaying cloud adoption. The first is the
availability of data regardless of user location. The second is the
integrity of data: transmitted messages must be the same as those
received. The third is confidentiality: preventing unauthorized
users from gaining access. The security of the user's data is the
cloud provider's responsibility. This paper focuses on
cryptographic methods, comparing several algorithms in order to
provide secure and efficient data encryption. Cryptography
is the discipline that provides secure data transmission
and retrieval over a given channel. It includes
two main processes: encryption, which converts plaintext into
unintelligible ciphertext, and decryption, which recovers the plaintext.


IV. SECURITY ALGORITHMS IN CLOUD 


Secure cloud computing over the network needs an
encryption algorithm. Encryption is the fundamental
tool for protecting data. This section compares the different
types of algorithms, as follows:


A. Symmetric and Asymmetric Algorithms


Symmetric algorithms use one key to encrypt and decrypt
data. Security depends on the key: anyone who
has the key can decrypt and read the content of messages or
files. Examples are the Data Encryption Standard (DES), Blowfish
and the Advanced Encryption Standard (AES). Asymmetric
algorithms have two keys: the first is the public key (PB) and
the second is the private key (PK). The PB encrypts the
messages, and the PK decrypts them. Examples are RSA,
Diffie-Hellman key exchange (D-H) and DSA.


V. IMPORTANT SECURITY ALGORITHMS 


Different algorithms have been proposed to treat the security
problems in communication, data anonymization and data stored
in cloud storage. Various types of algorithms can protect the
cloud at different levels and for different uses. The selected
algorithms are DES, 3DES, AES, Blowfish, Diffie-Hellman key
exchange and RSA.


VI. DETAILS OF ALGORITHMS 
A. Data Encryption Standard (DES)


DES (Data Encryption Standard, also called the Data Encryption
Algorithm, DEA) has been one of the most widely used data
encryption systems. It was established in 1977 by the National
Bureau of Standards (NBS), which is now the National Institute
of Standards and Technology (NIST). Symmetric encryption is an
earlier type of data encryption, which has the advantages of fast
encryption speed and high encryption efficiency. DES is a
symmetric encryption algorithm that uses 56-bit keys and is a
typical representative of block ciphers. Because its key length
is too short, DES is no longer secure, and it has been replaced
by AES.

1) Principle of DES Encryption Algorithm: DES has two inputs,
the plaintext and the key. The plaintext block and the key are
both 64 bits long; 56 bits of the key are effective key bits, and
the remaining 8 bits are parity bits.
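
The 64-bit key layout described above can be illustrated with a minimal Python sketch. It assumes the usual DES convention that each key byte carries 7 key bits plus one odd-parity bit in the least significant position; the function names are hypothetical and chosen only for this illustration.

```python
def set_odd_parity(key56: int) -> bytes:
    """Expand a 56-bit key into the 64-bit DES key layout (8 bytes)."""
    if not 0 <= key56 < (1 << 56):
        raise ValueError("key56 must fit in 56 bits")
    out = bytearray()
    for i in range(8):
        # Take the next group of 7 key bits, most significant group first.
        seven = (key56 >> (7 * (7 - i))) & 0x7F
        ones = bin(seven).count("1")
        # Choose the parity bit so that the whole byte has an odd number of 1s.
        parity = 0 if ones % 2 == 1 else 1
        out.append((seven << 1) | parity)
    return bytes(out)


def has_odd_parity(key64: bytes) -> bool:
    """Check that every byte of a 64-bit key has odd parity."""
    return len(key64) == 8 and all(bin(b).count("1") % 2 == 1 for b in key64)


def effective_key(key64: bytes) -> int:
    """Drop the 8 parity bits and return the 56 effective key bits."""
    value = 0
    for b in key64:
        value = (value << 7) | (b >> 1)
    return value
```

Only the 56 effective bits contribute to security, which is why the short key length, not the 64-bit format, determines the strength of DES.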


2) Improvements for the DES Algorithm (3DES Algorithm):
3DES uses three different keys to perform
encryption-decryption-encryption on the plaintext. The processing
speed of the 3DES algorithm is not fast, and it is rarely
used for new purposes except where special attention
must be paid to backward compatibility. The improvement
described here uses the idea of the 3DES algorithm to extend
the key length without losing efficiency.


Fig. 2. Schematic diagram of DES improved algorithm


Figure 2 shows the main encryption process of the improved
DES algorithm.


Step 1. The length of the plaintext block is changed from the
original 64 bits to 128 bits.

Step 2. Divide the data of each 128-bit group into the left
group L and the right group R; each group is 64 bits.

Step 3. The left group L and the right group R are encrypted
with two different sets of keys, Kl and Kr, respectively.

Step 4. The two sets of data obtained after each iteration are
handled as follows: the Ll part in the left group L and the Lr
part in the right group R are exchanged, and the exchanged
result is used as the input of the next iteration to
continue encryption. By analogy, the results of each
iteration are exchanged.

Step 5. Perform 16 rounds of iterative transformation according
to the process above and merge the transformation
results into a data block with a length of 128 bits as
the result of this group of encryptions.


B. Advanced Encryption Standard (AES) 


The AES algorithm (also known as the Rijndael algorithm)
is a symmetric block cipher that takes plaintext
in blocks of 128 bits and converts them to ciphertext using
keys of 128, 192 or 256 bits. Since the AES algorithm is
considered secure, it has become a worldwide standard. There are
three main operations performed in the AES algorithm:
encryption, decryption and key generation.


1) Encryption Process:


• The key size as well as the plaintext is obtained for
encryption.

• The pre-round transformation is performed on the
plaintext.

• ‘n’ rounds are performed, where ‘n’ depends on the key size.

• The ciphertext is obtained after ‘n’ rounds.


2) Decryption Process:


• The key size as well as the ciphertext is obtained
for decryption.

• The pre-round transformation is performed on the
ciphertext.

• ‘n’ rounds are performed, where ‘n’ depends on the key size.

• The plaintext is obtained after ‘n’ rounds.


(In order to perform decryption, all the steps in encryption
are performed in reverse order.)

3) Key Generation:


• Get the key.

• The number of words needed is computed based on the
number of rounds.

• The first words of the key schedule, each a 4-byte array,
are created directly from the key.

• The next word is obtained by performing RotWord
and SubWord.

• The previous step is then repeated in order to reach the
required number of words.
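
How the number of rounds and the number of key-schedule words follow from the key size can be sketched in a few lines of Python. The relations used (Nr = Nk + 6 rounds, and 4 × (Nr + 1) expanded 32-bit words for the 128-bit block) are those of the AES standard; the function and dictionary key names are hypothetical.

```python
AES_BLOCK_WORDS = 4  # Nb: a 128-bit block is 4 words of 32 bits


def aes_parameters(key_bits: int) -> dict:
    """Return the round count and key-schedule size for an AES key size."""
    if key_bits not in (128, 192, 256):
        raise ValueError("AES keys are 128, 192 or 256 bits long")
    key_words = key_bits // 32          # Nk: key length in 32-bit words
    rounds = key_words + 6              # Nr: 10, 12 or 14 rounds
    # One round key per round plus one for the pre-round transformation.
    expanded_key_words = AES_BLOCK_WORDS * (rounds + 1)
    return {
        "key_words": key_words,
        "rounds": rounds,
        "expanded_key_words": expanded_key_words,
    }
```

For example, a 128-bit key gives 10 rounds and 44 key-schedule words, of which the first 4 are copied from the key itself.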


C. Blowfish Algorithm 


Blowfish is a symmetric-key block cipher. It has a 64-bit block
size, and the key length varies from 32 bits to 448 bits. It
uses large key-dependent S-boxes and a 16-round Feistel
cipher process. The three main operations of the
Blowfish algorithm are the same as those of the DES algorithm:
encryption, decryption and key generation.


1) Encryption Process:


• The 64-bit input X is divided into two equal 32-bit
parts, XL and XR.

• From round 1 to round 16 (i = 1, ..., 16): new XL = XL XOR Pi,
new XR = F(XL) XOR XR, and then XL and XR are swapped.

• After the last round the final swap is undone, and XL and XR
are XORed with P17 and P18: new XR = XR XOR P17, while
new XL = XL XOR P18.

• XL and XR are combined.


2) Decryption Process:

• P1 to P18 are used in reverse order, but with the same
process as in encryption.
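
The Feistel structure described above, and the fact that decryption is the same process with the subkeys reversed, can be sketched in Python as follows. This is not real Blowfish: the subkey values in `P` and the round function `toy_f` are placeholders assumed for illustration only, whereas real Blowfish derives P1 to P18 and its four S-boxes from the digits of pi and the user key.

```python
MASK32 = 0xFFFFFFFF

# Placeholder subkeys P1..P18 (hypothetical values, NOT the real Blowfish P-array).
P = [(0x9E3779B9 * (i + 1)) & MASK32 for i in range(18)]


def toy_f(x: int) -> int:
    """Placeholder round function; real Blowfish uses key-dependent S-boxes."""
    return ((x * 0x45D9F3B) ^ (x >> 16)) & MASK32


def feistel(block: int, subkeys: list) -> int:
    """Run the 16-round Blowfish-style network on one 64-bit block."""
    xl, xr = (block >> 32) & MASK32, block & MASK32
    for i in range(16):
        xl ^= subkeys[i]        # new XL = XL XOR Pi
        xr ^= toy_f(xl)         # new XR = F(XL) XOR XR
        xl, xr = xr, xl         # swap the halves
    xl, xr = xr, xl             # undo the last swap
    xr ^= subkeys[16]           # XR = XR XOR P17
    xl ^= subkeys[17]           # XL = XL XOR P18
    return (xl << 32) | xr


def encrypt(block: int) -> int:
    return feistel(block, P)


def decrypt(block: int) -> int:
    # Same process as encryption, with the subkeys used in reverse order.
    return feistel(block, P[::-1])
```

Because of the Feistel construction, `decrypt(encrypt(x))` returns `x` for any 64-bit block regardless of how the round function is chosen; the security of the real cipher comes from its key-dependent S-boxes and subkeys.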
3) Key Generation:

• The P-array is initialized first and then the S-boxes. The
string used must be fixed and consists of the hexadecimal
digits of pi.

• P1 is XORed with the first 32 bits of the key, while P2 is
XORed with the second 32 bits. This step is repeated, cycling
through the bits of the key, until the entire P-array has been
XORed with key bits.

• An all-zero string is encrypted using the subkeys described
in steps 1 and 2.

• P1 and P2 are replaced with the output of step 3.

• The output is then encrypted with the modified
subkeys.

• P3 and P4 are replaced with the latest output.

• The process is repeated until all entries of the P-array have
been replaced, followed in order by all entries of the S-boxes.

A comparative study of the AES, DES and Blowfish algorithms
has been carried out in this research. It can be seen that
Blowfish has the best speed performance among
these three, but the study does not discuss security against
various attacks, and it could be extended to performance on
networks.


D. Hybrid Algorithms 


General steps for our proposed hybrid algorithm are as 
follows: 


Step 1. Start

Step 2. Initialize block size and key size

Step 3. Select the preferred key size or key length, KL = [key
sizes of AES and Blowfish]

Step 4. KL = total key size of the hybrid algorithm

Step 5. Select file to be encrypted 

Step 6. Enter the generated key and encrypt 

Step 7. Select the encrypted file for decryption 

Step 8. Decrypt cipher text to obtain original file 

Step 9. Stop. 


E. Diffie-Hellman Algorithm 


Cloud computing has many security challenges, such as
data integrity, unlicensed access, DoS, etc. Traditional
cryptographic algorithms are used to protect users' sensitive
data from tampering and unauthorized access. Diffie-Hellman
is the first public-key algorithm, using discrete logarithms in
a finite field. In order to transmit keys securely, the
Diffie-Hellman key exchange algorithm is used. The
Diffie-Hellman protocol allows the construction of a
common secret key over an insecure communication channel.
The scheme discussed here aims to protect user data from the
MiM attack and the plaintext attack. It applies the algorithm
to the distributed key together with a confidential number.
This number is used as a core value that identifies the cloud
member to the communicating party and is intended to make
the exchange secure against MiM attacks and
plaintext attacks.
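
The basic (unauthenticated) key exchange can be illustrated with a short Python sketch. The names N, G, X and Y follow the labels that appear in the paper's schematic; the function names and the tiny numbers in the usage note are assumptions for illustration, and real deployments use primes of thousands of bits.

```python
import secrets


def dh_private(n: int) -> int:
    """Pick a random private number in the range [2, n - 2]."""
    return secrets.randbelow(n - 3) + 2


def dh_public(g: int, n: int, private: int) -> int:
    """Public value sent over the insecure channel: G^private mod N."""
    return pow(g, private, n)


def dh_shared(other_public: int, private: int, n: int) -> int:
    """Shared secret: (other party's public value)^private mod N."""
    return pow(other_public, private, n)
```

With N = 23 and G = 5, if Alice picks X = 4 and Bob picks Y = 3, Alice sends A = 4 and Bob sends B = 10, and both arrive at the shared key K1 = K2 = 18. Nothing in this exchange authenticates either party, which is why the plain protocol is exposed to the MiM attack discussed in this section.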

1) MiM (Man-in-the-Middle) Attack:

This is an active attack in which the invader prevents the
message from continuing to its destination. He can also
modify the message by presenting himself as one of
the communicating parties.

2) Plaintext Attack:

An invader has the plaintext as well as the encrypted text in this
type of attack. If the distributed key is a constant key,
then it will generate the same encrypted text for the same
plaintext. This information is used by an invader to find
the relationship between the plaintext and the encrypted text.


Fig. 3. Comparison among DES, 3DES, AES, Blowfish, RSA and Diffie-Hellman Algorithms


Fig. 4. Schematic diagram of Diffie-Hellman key exchange algorithm. (Alice and Bob agree on two large prime numbers, which need not be secret and can be shared publicly; each chooses a private random number; Alice computes her secret key K1 and Bob computes his secret key K2; K1 = K2 completes the key exchange.)


Both parties can calculate the same secret key, as they are the
only ones who know the numbers n and q. The basic algorithm is
vulnerable to the MiM attack because it has no mechanism
to authenticate the parties. Also, the keys remain the same for
the session, so the same plaintext always results in the same
encrypted text. Using this information, an invader can find
the relationship between the plaintext and the ciphertext.
Applying the confidential number described above reduces the
plaintext attack and the MiM attack in the Diffie-Hellman
algorithm. However, designing a 100% secure key exchange
algorithm is not easy.

F. RSA Algorithm


The RSA (Rivest-Shamir-Adleman) algorithm is the best known
and most widely used public-key cryptosystem. It uses large
integers, for example 1,024 bits in size. It has only one round
of encryption. It is an asymmetric block cipher.


Key Generation
Select p, q: $p$ and $q$ both prime
Calculate n: $n = p \times q$
Select integer d: $\gcd(\phi(n), d) = 1$, $1 < d < \phi(n)$
Calculate e: $e = d^{-1} \bmod \phi(n)$
Public Key: $KU = \{e, n\}$
Private Key: $KR = \{d, n\}$

Encryption
Plaintext: $M < n$
Ciphertext: $C = M^{e} \pmod{n}$

Decryption
Ciphertext: $C$
Plaintext: $M = C^{d} \pmod{n}$

Fig. 5. Schematic diagram key generation in RSA algorithm 


RSA is an algorithm used by modern computers to encrypt
and decrypt messages. RSA is an asymmetric cryptographic
algorithm. This is also called public-key cryptography, because
one of the two keys can be shared with everyone while the other
key must be kept private.
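
The key generation, encryption and decryption steps of Fig. 5 can be traced with a small Python sketch. It follows the figure's order (select d, then derive e as its modular inverse); the function names are hypothetical, the primes in the usage note are toy values chosen for readability, and practical RSA additionally needs large random primes and a padding scheme.

```python
from math import gcd


def rsa_keys(p: int, q: int, d: int):
    """Return (public key {e, n}, private key {d, n}) as in Fig. 5."""
    n = p * q
    phi = (p - 1) * (q - 1)            # Euler's totient of n
    if not (1 < d < phi and gcd(phi, d) == 1):
        raise ValueError("d must satisfy 1 < d < phi(n) and gcd(phi(n), d) = 1")
    e = pow(d, -1, phi)                # e = d^-1 mod phi(n) (Python 3.8+)
    return (e, n), (d, n)


def rsa_encrypt(m: int, public_key) -> int:
    e, n = public_key
    if not 0 <= m < n:
        raise ValueError("plaintext must satisfy M < n")
    return pow(m, e, n)                # C = M^e mod n


def rsa_decrypt(c: int, private_key) -> int:
    d, n = private_key
    return pow(c, d, n)                # M = C^d mod n
```

With p = 61, q = 53 and d = 2753, this gives n = 3233 and e = 17; the plaintext M = 65 encrypts to C = 2790 and decrypts back to 65.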


G. Comparison of Various Data Security Algorithms 


Figure 3 contains a detailed comparison of the various 
algorithms. 


VII. CONCLUSION 


In this paper we have discussed only some of the important
cloud security algorithms that can be used according to
user requirements. There is a considerable improvement in
data communication between the nodes after key management
techniques have been employed. Among the algorithms considered,
the RSA algorithm is found to be the most
efficient, and it can also be used in a multi-cloud environment.
The RSA algorithm is safe and secure for its users through the
use of complex mathematics. It is hard to crack since
it relies on the factorization of the product of large prime
numbers, which is difficult to compute. Moreover, the RSA
algorithm uses the public key to encrypt data, and this key may
be known to everyone; therefore, it is easy to share the public key.
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Abstract—Augmented Reality is a combination of a real 
and a computer-generated or virtual world. It is achieved by 
augmenting computer-generated images on real world. It is 
of four types namely marker based, marker less, projection 
based and superimposition based augmented reality. It has many 
applications in the real world. AR is used in various fields such as 
medical, education, manufacturing, robotics and entertainment. 
Augmented reality comes under the field of mixed reality. It can 
be considered as an inverse reflection of Virtual Reality. They 
both have certain similarities and differences. This paper gives 
information about Augmented Reality and its applications. 


Index Terms—Augmented reality, interactive learning system, 
collaborative learning environment, occlusion, OSCE, dentistry. 


I. INTRODUCTION 


AUGMENTED reality (AR) is the real-time use of information
in the form of text, graphics, audio and
other virtual enhancements integrated with real-world
objects. It is this “real world” element that differentiates AR
from virtual reality. AR integrates and adds value to the user’s
interaction with the real world. It is the synthesis of real
and virtual imagery. It is becoming an emerging platform in
new application areas such as medical training systems, medical
display, entertainment and games.


We have all come across augmented reality at some point
in our lives, be it while playing a game of
Pokémon Go, amusing ourselves by taking selfies with
wild Snapchat filters, decorating our homes through
the IKEA app, or trying out different varieties
of makeup with the L’Oreal app. That is augmented reality:
being present in the real world, yet being able to interact
with something that you can see and manipulate but which isn't
really there. Augmented reality is a technology that enhances
the real world by affixing layers of digital elements onto it.
These elements include computer-generated graphics, sound
or video effects, haptic feedback, or sensory projections. The
intention behind adding this digital information is to provide
an engaging and dynamic customer experience, enabled
by the input received from varied hardware such as smart glasses,
smart lenses and smartphones.


Augmented reality:

• Overlays computer-generated 3D content on the real world.

• The user is able to interact with both the real world and the virtual world.

• The user can clearly distinguish between the two worlds.

• It is achieved with smartphones, tablets or AR wearables.

Virtual reality:

• Visually immerses the user in simulated objects and environments.

• Completely shuts out the real world and makes the user think that they are really in the virtual world.

• The user finds it hard to differentiate between the virtual and the real world.

• It is achieved with VR headsets.


Fig. 1. Augmented reality vs. virtual reality 


II. MARKERLESS TRACKING ALGORITHM BASED ON 3D 
MODEL FOR AUGMENTED REALITY SYSTEM 


The prior knowledge defined in this paper consists of two 
parts. One is descriptors of feature points, and the other is 2D 
and 3D information corresponding to the scene. 


A. Selection of Feature 


For an AR system, stable features are very important. An
image feature should first of all be easy to extract and
to distinguish, and should not be sensitive to illumination,
rotation and scale changes. However, traditional Harris corner
points do not have all of these characteristics. David G. Lowe
first proposed the scale-space-based local feature descriptor,
the Scale Invariant Feature Transform (SIFT), in 1999; it has
the properties of scale invariance, rotation invariance and even
affine transformation invariance. He later summarized and
improved the theory in 2004. This kind of feature point
maintains a certain invariance under zoom, rotation, scale
and affine transformation, as well as perspective and
illumination changes, while maintaining good matching in the
presence of object motion, occlusion, noise and other factors. With
these properties, feature matching between two images with
relatively large differences can be realized.


B. Building of 3D Structure Information 


By the aforementioned method, the scene feature descriptors,
as well as part of the scene structure information, have
already been obtained. In the following, the correspondence
between the descriptors and the three-dimensional scene is
built through a structure-from-motion reconstruction method. The
idea of stratified reconstruction is used here; the difference
is that the camera used in this paper is calibrated, so the
reconstruction process is simplified.


C. Definition of Prior Information Structure 


After the above-mentioned 3D reconstruction process, the
scene’s prior information has already been obtained.
To meet real-time tracking requirements, an appropriate form
should be selected to organize and use this prior information.
For easy querying and matching, and for speed, the
basic principle is to use as little description as possible to
cover as much information as possible.


D. Real-Time Tracking 


After the real-time image is captured, the scene marking
information is queried first, and feature extraction on the
current frame is then guided by the returned marking information.
After this, 2D-3D matches are obtained by searching
the KD-tree. RANSAC is then applied to further refine
the matches and obtain the initial value of the P matrix.
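
The way RANSAC separates correct matches from wrong ones can be sketched in Python. To keep the example short, the model fitted here is a plain 2D translation between matched points, which is a deliberately simplified stand-in for the projection matrix P used in the tracking pipeline; the function name, threshold and iteration count are assumptions for illustration.

```python
import random


def ransac_translation(matches, threshold=2.0, iterations=100, seed=0):
    """Estimate a 2D translation from point matches that contain outliers.

    matches: list of ((x1, y1), (x2, y2)) pairs.
    Returns ((dx, dy), inliers).
    """
    if not matches:
        raise ValueError("at least one match is required")
    rng = random.Random(seed)
    best_inliers = []
    for _ in range(iterations):
        # Minimal sample: a single match fully determines a translation.
        (x1, y1), (x2, y2) = rng.choice(matches)
        dx, dy = x2 - x1, y2 - y1
        # Consensus set: matches that agree with this hypothesis.
        inliers = [
            (a, b) for a, b in matches
            if abs(b[0] - a[0] - dx) <= threshold
            and abs(b[1] - a[1] - dy) <= threshold
        ]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    # Refit the model on all inliers of the best hypothesis.
    n = len(best_inliers)
    dx = sum(b[0] - a[0] for a, b in best_inliers) / n
    dy = sum(b[1] - a[1] for a, b in best_inliers) / n
    return (dx, dy), best_inliers
```

The same sample, score and refit loop applies when the model is a camera projection matrix; only the minimal sample size and the error measure (reprojection error) change.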


III. AUGMENTED REALITY (AR) AND VIRTUAL REALITY 
(VR) APPLIED IN DENTISTRY 


With the increase in the elderly population and economic
growth, awareness of oral health has gradually increased, and
dental health care issues are increasingly important.


A. The History of Dentistry 


The history of dentistry is almost as long as the history of
human civilization. With the progress of science and technology,
the technology applied in dentistry has become more and
more mature, from the initial use of pliers to remove teeth and
wire to lock loose teeth, to dental appliances and dental
bridges.

Nowadays, dentists in the United States and European
countries must pass both written and technical examinations before
obtaining a license. Japan and China have also
fully implemented such examination policies.
In view of this, medical and dental colleges must provide
sufficient practice and professional knowledge. Better ways of
learning are without question a developing trend in global dental
education, and educational equipment and methods built around
such technology will be a must-have for dental universities
around the world.

An Objective Structured Clinical Examination (OSCE) is a
type of examination often used in health sciences. The OSCE
is a reliable evaluation method for the preclinical
examination of dental students. The ideal OSCE assessment
uses an augmented reality simulator for evaluation.


Table 1 Comparison of dental simulators modified from: Elby Roy 2017. 


PerioSim® CDs DentSim™ IDEA 


Ergonomic postures No Yes Yes No Yes 
Instant feedback No Yes Yes Yes Yes 
Exam simulation Yes Yes Yes No Yes 
Direct transfer of data to convenor/tutor Not Yes Yes Yes Yes 
Teeth used Animated Plastic teeth Animated Animated 
Right and left operation Available Yes Available Available Available 


Simodont® 


Fig. 2. Comparison of Dental Simulators 


B. VR and AR Combined with a Real-time Tracking System in
Surgery


As virtual reality technology has matured, more and
more VR and AR applications have appeared in the educational
and surgical fields. The development of reality devices allows
the user to combine medical information and medical data and to
visualize them together. This can provide clearer information and
help users improve safety and lower risk. Although
virtual reality and augmented reality are not yet
common in the dental field, they are much better developed in
other fields, such as neurosurgery and cranial surgery. Users can
use a head-mounted display (HMD) to see the medical
information and images combined with the surgery, which can
decrease the surgical risk more than common virtual
reality.

Informative technological advances in dentistry: With the
advanced development of Information Technology (IT), dental
solutions led by computer and internet technologies have
made significant progress all over the world. Digital dental
solutions will be the trend for the professional dental field in
the future. The rapidly developing digital dental solutions
have been applied in both the clinical dental field and the
dental education field.


IV. RESOLVING OCCLUSION IN AUGMENTED REALITY 
BASED ON INVARIANT FOR TWO VIEWS 


Augmented Reality (AR) is the synthesis of real and virtual
imagery. It is becoming an emerging platform in new
application areas such as medical training systems, medical display,
entertainment and games. Generally, such systems only
blend virtual imagery with real images and attempt to minimize
registration errors. However, this method is not effective
when occlusions exist between virtual and real objects. This
occlusion problem could easily be solved on condition that the
model of the 3D scene is known.

In this paper, we present a contour-based approach without
3D reconstruction. First, the key points of the occluding contours
between virtual and real objects are specified interactively
according to epipolar and other constraints in the first two
frames. Second, the SIFT and RANSAC algorithms are applied
to search for the correct point correspondences in any two
views. With these points, the points of the occluding contour
C are transferred to any view by the invariant for two views.
Finally, it becomes feasible to track the occluding contours in
any view, so the virtual objects can be drawn behind the contour.


A. Resolving Occlusion Based on Invariant 


When a virtual object is added into the scene, the user needs
to specify the relationship between it and the real objects. For
instance, we want to add the virtual house A behind the real
house B. In order to resolve occlusion, if it is feasible to track
the contour consisting of key points 1, 2, 3, 4, the virtual house
A can be drawn behind the contour. If the scene is complicated,
we may label each contour point as being “behind” or “in front of”,
depending on whether it is behind or in front of the
virtual object.


Fig. 3. The occlusion between real and virtual objects 


According to the above invariant, if the position of a point
on the occluding contours is determined in the stereo images, the
position of the point in any other image may be calculated as
the intersection of two lines. The main question is how, after the
position of a point in the first frame is determined, to fix
its position in the second frame. As is well known, if the
projection of a point P in the 3D world is specified in one image,
its projection in the second image must lie on a line satisfying
the epipolar constraint. At the same time, through the point
collinearity or coplanarity constraint, the position of any point
in the second frame can be determined.
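
The two geometric operations used here, computing the epipolar line of a point and intersecting two lines, can be sketched in Python with homogeneous coordinates. The sketch assumes a fundamental matrix F relating the two views is already available (for a point x in the first image, its epipolar line in the second image is l' = F x); how F is obtained is outside this example, and the function names are hypothetical.

```python
def epipolar_line(F, point):
    """Line (a, b, c) in the second image, with a*x + b*y + c = 0, for a
    point (x, y) in the first image: l' = F * (x, y, 1)^T."""
    p = (point[0], point[1], 1.0)
    return tuple(sum(F[r][c] * p[c] for c in range(3)) for r in range(3))


def line_residual(line, point):
    """Signed value a*x + b*y + c; zero when the point lies on the line."""
    a, b, c = line
    return a * point[0] + b * point[1] + c


def intersect(line1, line2):
    """Intersection of two lines, computed as their homogeneous cross product."""
    a1, b1, c1 = line1
    a2, b2, c2 = line2
    w = a1 * b2 - a2 * b1
    if abs(w) < 1e-12:
        raise ValueError("lines are parallel")
    return ((b1 * c2 - b2 * c1) / w, (c1 * a2 - c2 * a1) / w)
```

As a sanity check, for a camera that only translates horizontally the fundamental matrix can be taken as F = [[0, 0, 0], [0, 0, -1], [0, 1, 0]], and the epipolar line of the point (3, 4) is the horizontal line y = 4. Intersecting the epipolar lines obtained from two known views is what fixes the position of a contour point in a further view.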


B. Experiments 


In order to demonstrate the effectiveness of this method, 
two typical examples were selected and briefly described in 
the following. 

1) Experiment 1: The first experiment is to fuse the virtual
building into a real 3D scene, based on the affine
structure for calibration-free augmented reality. Fig. is the
experiment environment. Now, suppose that a high building
will be constructed. In Fig. (a), a set of salient points are 
extracted using SIFT algorithm. The black lines show all 
correct correspondences based on RANSAC. In Fig. (b) and 
(c), the affine frame is constructed from the image coordinates 
of cross-center in Fig. (b), which are acquired by interactive 
method. Similarly, the image coordinates of key points on the 
occluding contour are obtained in the first image (see the 
four small black squares in Fig. (b)). The epipolar lines of 
these points are computed and drawn in the second image 
(see the four black lines in Fig. (c)). Then, according to 
epipolar and collinearity or coplanarity constraints, the image 
coordinates of key points on the occluding contour in Fig. (c) 
are obtained with interactive method. Furthermore, according 
to the invariant for two views and point correspondences, 
the positions of the occluding contour in other frames are 
determined. In Fig. (d) the virtual building is fused into the 
3D scene very well. 


Fig. 4. Extracted salient feature-points and correct correspondences 


Fig. 5. The first view of the scene 


2) Experiment 2: The second experiment in Fig is to fuse
a building group into a scenic spot. The process of resolving
the occlusion problem is similar to that in experiment 1. The
difference between the two experiments is that the occluding
contour between virtual and real objects (see the two black
circles in Fig. ) is relatively complicated. Of course, the more
accurately the occluding contours are to be tracked, the more key
points on the contours need to be interactively marked in
the first two frames.
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Fig. 6. Augmented virtual building 




Fig. 7. The view of real scene 


V. USE OF AUGMENTED REALITY IN LEARNING 


The digital revolution gave rise to a need for information 
and contributed to the decline of traditional information and 
knowledge accumulation, processing, and transmission struc- 
tures. It has been indicated earlier that augmented reality can 
be brought about by a variety of devices and platforms. It is 
related to the phenomenon of media convergence. Augmented 
reality utilizes three screens or displays, while out of the 
TV, computer, and mobile telephone trio the phone display 
has the crucial role. A long time has passed since the first, 
perhaps less successful attempts of mobile service providers 
to generate content in a quantity determined by the user. 
Nowadays content quantity and user activity demands can be 
reconciled by the adaptation of a proven and tested model 
to the context of the mobile phone. Thus, users observing 
the existing operational rules and guidelines must be provided 
with complete and unlimited access to the worldwide web 
via a significantly larger and touch operated display screen. 
As a result of this process the smart phone becomes “the 
most personal computer”. Since augmented reality based on
mobile devices eliminates the need for expensive equipment or
the acquisition of new knowledge, the number of applications
generating additional interactive layers over the physical world
is expected to rise.


Fig. 8. Augmented virtual building group


VI. AUGMENTED REALITY IN GAME DEVELOPMENT 


Interactivity in a multimedia environment refers to a process 
where a click or touch on a picture or text launches an action 
leading to another context or starts a video, or displays another 
text. The continuous evolution of interaction can be accessed 
in the following manner: Previously, we looked at a picture 
and mentally traced our own personal cognitive associations to 
another image. Now the interactive computerized media calls 
on us to click on a highlighted sentence to reveal another 
image and follow the pre-programmed objectively present cog- 
nitive associations. When new technologies emerge, usability 
plays a significant role in promoting their integration into 
the given social and cultural context. The very concept refers
to the ease or difficulty of acquiring the information needed for
the appropriate and easy use of the given application.
The evolution of usability can be represented by a continuum
beginning with the mouse and keyboard operated personal
computers and ending with touch or motion controlled display
screens.

The success of Pokémon Go is based on the simplification 
of a complex yet spectacular technology and the promotion 
or enhancement of the user experience. The enhanced ex- 
perience includes not only walks in a virtual space, but for 
example physical discovery of cities and abandoned factories. 
In addition to a purified and simple surface and easy usability 
its most important feature is its ability to display pictures 
embedded in texts thereby enabling the user to enjoy a 
significantly enhanced participatory experience via multimedia 
applications. 


VII. AUGMENTED REALITY IN EDUCATION 


Augmented reality can be applied in education. In geography
atlases, 3D models can be presented by using mobile
devices. This way the scenery comes alive. In a biology atlas,
a human heart may be transformed into a beating, animated
virtual organ on the screen. Students are also able to watch
experiments in a physics course. Using smartphones and
tablets they can observe experiments from several angles.
This way, dangerous experiments may be presented safely.
However, creating augmented reality content requires a lot
of time and skill. Augmented reality can be used in various
ways in learning, in both formal and non-formal education. If
teachers prepare remarkable visual tools, students can consume
this content easily and with more motivation. Also, students
can create AR elements related to the materials they focus on
in the given lesson. When creating their own content, students
become more involved in learning and learn how to master the
skills and competences at a higher level.


VIII. AUGMENTED REALITY IN HEALTH AND MEDICINE 


Augmented reality (AR) is an emerging and steadily maturing
technology that scholars have been applying to study and
improve health processes and outcomes. This chapter seeks to
explore the intersections between different disciplinary fields
in which AR research has explored health questions. The
breadth of literature in AR and health suggests that there
are many ways to approach this topic, ranging from AR as a
medical display, a training mechanism for health professionals
(e.g., surgical medical residents), a medical intervention (e.g.,
treatment of phobias), or a tool for everyday use in various
health contexts and conditions.


A. Individual Use of AR: AR in Health Interventions 


While AR has been the subject of academic research for
decades, the earliest publicly available AR applications were
accessed through mobile handheld devices. Some companies
then launched head-worn AR devices for consumer use (e.g.,
Google Glass, Microsoft HoloLens) as a different form factor
for AR. Many of these applications that incorporated AR
elements were geared toward an explicit health focus. These
early studies found that AR applications accessed through both
mobile handheld and head-worn devices were conducive to
various health intervention purposes. One appealing feature
of AR is its mobility for delivering health
interventions. While there are many health interventions that
are stationary or tethered to a particular setting, these are
limited in their effectiveness by geographic constraints,
the times they are available to patients, and the unnatural
environment in which they are placed. Mobile AR enables certain
types of health interventions to be deployed wherever the
patient is located and at various times (e.g., in-home AR
exercises). Mobile technologies are an increasingly promising
platform for health interventions because they are attractive
and familiar to patients and because of their ubiquity. A
popular area for health interventions via AR is physical
activity in individuals, ranging from contexts
of diet and exercise to prevention of falls for older adults
with paralysis. “Exergames,” a subgroup of SG, is a term
that combines the words exercise and game, and it refers to
technologies that promote healthy behaviors by combining
video game technologies and exercise. These games influence
players to be physically active to succeed in games, and there
is growing evidence that exergames help individuals stay fit
and manage their weight.


B. AR in Health at Home 


One of the key challenges for healthcare providers is providing
outpatient care after patients leave a medical facility.
AR offers one technology to remedy this problem. AR has
been utilized as a tool for occupational therapists to walk
through a home and visualize modifications that may be needed
to facilitate mobility and prevent falls for stroke
patients. AR has also been considered as a mobile interface
for controlling smart home functions and appliances, which
may be particularly useful for older adults or individuals with
physical disabilities. Each of these applications considers
ways that AR can facilitate changes in the home, whether
by helping therapists improve the spaces where people live
or by improving a patient’s ability to control the functions of
their home.


C. AR in Health Education 


Medical education also lends itself to some of the unique
capabilities of AR, for example, understanding human body
structure. Anatomical training requires dissection of the human
body, which enables the students to study the human body
structure in 3D. In traditional anatomy, this training requires
a real human cadaver, which is costly and limited
in number. AR has also been conceptualized as an alternative
way for anatomical education of medical students, using
real-time visualization that is integrated in situations with
relatively lower cost. AR has been utilized to help medical
students envision the anatomical structure, organ dynamics,
surgical procedures, and diagnoses of patients.

Many of the first AR applications were geared toward
visualizing medical diagrams and anatomical displays, such
as the brain, nervous system, skeletons, and teeth. Given
the ability of AR to depict 3D overlays, some of these
applications allowed users to move around a particular object
and see it from multiple perspectives, zoom into specific areas,
and pull apart displays to see inside a particular system. Early
work in this area has examined whether the interactive and 3D
nature of AR anatomy displays could help medical students
learn and remember more information from their medical
training.

The visual nature of AR also lends itself to training medical
professionals for specific operations and procedures. AR has
been found to be an effective tool for training health professionals
in how to place an injection needle and how to position an
intraoral distractor.


IX. AR FOR REMOTE TELEMEDICINE 


As a field, one of the key problems in medicine is matching 
expertise to geographic/local conditions. Specifically, there are 
well-documented issues of medical staff shortages in rural 
areas, as well as equipment disparities. Beyond the broader
issue of geographic disparities, there are also certain operations 


Anjaly Jayaraj et al., “Augmented Reality” 




Proceedings of Vidya MCA Departmental Seminar (VMCADS - 2021), 22 - 23 November 2021 


Vidya Academy of Science & Technology, Thrissur - 680501 


and surgeries where the pool of experts is so small that they
are not physically able to travel to all the places where the
operation needs to take place. In these cases, telemedicine has
been a proposed solution, allowing healthcare professionals to
evaluate, diagnose, and treat patients in remote locations using
telecommunications technology.


X. GENERAL REQUIREMENTS FOR INDUSTRIAL 
AUGMENTED REALITY APPLICATION 


This paper strives to provide a structured overview of the
general requirements for AR applications in the industrial
sector based on a literature review. Particular emphasis is
placed on conditions and requirements that hinder applications
and are specific to the industrial sector. The findings are validated
through two case studies from industrial applications that
concern the application of AR to support mobile maintenance
processes as well as the training of welders.

The description of industrial AR applications aims to sup- 
port the deduction of general requirements. Therefore, a wide 
span of industrial applications is covered, but they are by no 
means exhaustive. Industrial AR applications are expected to 
perform well in the following areas: 


• Product design
• Plant design
• Training
• Production assistance
• Quality assurance
• Production logistics
• Remote maintenance


A. General Requirements Towards AR Technologies in Indus- 
trial Applications 


This section covers general requirements of AR applications 
for the industrial sector. The description follows a cross section 
approach of industrial AR applications. The requirements are 
structured by dimension of time (development and integration, 
set-up, operation). 


• Cost-effectiveness
• Data security
• Applicable regulations
• Set-up time
• System reliability
• Accuracy of presentation
• Real-time capability
• Ergonomics


The presented requirements have been collected with a 
cross-application approach and show a rather low level of 
detail. This does not limit them to the industrial area, so that 
they instead may also apply for applications in other areas, 
such as flight simulators. 


XI. THE DESIGN AND IMPLEMENTATION OF AUGMENTED 
REALITY LEARNING SYSTEMS 


We present an augmented reality learning system that en- 
ables users to experience an interactive flower garden with the 
assistance of interactive agents in the augmented picture. We 
develop an interactive agent that generates problem-solving 
peer support to a user’s action. We overlay virtual flower gar- 
den over a physical book and offer a collaborative environment 
that allows a learner to interact with the agent. To evaluate 
the effectiveness of the proposed system, we implement it on 
a mobile device and enable users to experience the collabo- 
rative task with the animated character. Through evaluation, 
we found that the interactive agent could be a promising 
technology for motivating users to engage in learning systems.

Augmented Reality Learning System

We present an augmented reality learning system that provides
learners with opportunities to simulate flower gardening
over a physical book. To enable the learners to experience
the learning environment directly, we offer them an augmented
scene, consisting of simulation factors and virtual flowers,
through their mobile devices with a camera. We allow users
to explore environmental considerations of gardening with
the user interface. To improve learners’ engagement in gardening,
we augment a picture with an interactive agent that assists
users in achieving desired goals in the gardening environment.
Specifically, it allows users to seamlessly interact with an
interactive agent through their mobile devices.

Fig. 9. Interactive gardening environment


XII. AN INTERACTIVE AGENT 


We develop an interactive agent based on the framework for
designing interactive agents in interactive learning environments.
Thus, the agent has the ability to perceive changes
caused by a user’s actions in the learning environment. It generates
peer-like responses in accordance with the agent’s own beliefs,
desires, and intentions. Consequently, the agent generates
problem-solving advice with peer support in an autonomous
way.


XIII. AUGMENTED REALITY IN INDUSTRY 4.0 


Since the origins of Augmented Reality (AR), industry has
always been one of its prominent application domains. The
recent advances in both portable and wearable AR devices
and the new challenges introduced by the fourth industrial
revolution (known as Industry 4.0) further enlarge the
applicability of AR to improve productivity and to
enhance the user experience. This paper provides an overview
of the most important applications of AR in the industry
domain. Key among the issues raised in this paper are the
various applications of AR that enhance the user’s ability to
understand the movement of a mobile robot, the movements of a
robot arm and the forces applied by a robot. It is recommended
that, in view of the rising need for both user and data privacy,
the technologies that form the basis of Industry 4.0 change
their own way of working to embrace data privacy.


XIV. HUMAN-ROBOT COLLABORATION 


The fourth industrial revolution is bringing new technolog- 
ical challenges. The capability of industrial robots is steadily 
increasing, together with the expectation of stronger coopera- 
tive interaction. Operators need to work in a safe environment 
that enhances their trust in the robots. To create a system in 
which robots work side by side with humans, new interfaces 
must be developed to allow users to interact with them in the 
most natural way. For these reasons, new scientific disciplines 
are emerging. Human Robot Collaboration (HRC) is a new 
scientific discipline that tries to understand how to improve 
the human-robot collaboration using innovative interfaces. 
Creating a safe and trustworthy human-robot system is a 
complex challenge. 

AR, among other applications, is a promising technology 
that can enhance the user’s ability to understand: 

• The movements of a mobile robot
• The movements of a robotic arm
• The forces applied by a robot


A. Mobile Robot Movement 


Industries often employ Automated Guided Vehicles (AGVs),
instead of skilled human laborers, for material transportation.
AGVs are robots that can move independently, and
they are often used to transport equipment around a manufacturing
facility. Most of the time, an AGV follows a predefined
path, which on the one hand makes it easy for workers
to predict the robot’s intentions, but on the other hand imposes
some limitations on the type of task the AGV can perform. The
next generation of AGVs will be capable of moving without
following a predefined path and will be able to decide,
in real time, what is the best trajectory to follow in a given
environment [54-61]. This behaviour introduces some degree
of uncertainty, and for this reason the communication of the
vehicle’s intention must be as clear as possible. In fact, a way
to improve the safety of these systems might be to give the
robot the ability to understand and predict human motion.


B. Robot’s Arms Movement 


Many different tasks are accomplished in a factory; one of them
is the pick and place action or assembly
procedure. These activities are usually performed by so-called
“arm robots”, which are capable of grabbing objects and
placing them in specific areas. As with the AGV’s movement,
understanding in advance the path that will be taken by the
robot’s arms is crucial to allow humans to predict the robot’s
intention.


C. Robot’s Force 


Monitoring the robot’s movement is useful to understand its
intentions, but it lacks information about how much strength
the robot is employing in performing its task. Mateo et al.
developed an Android-based application for programming
industrial robots. The task can be monitored by overlaying
real-time information through AR. For example, while the
robot is performing the task, the force components of the tool
center point in X, Y and Z are displayed, as well as the resulting
vector, using a 3D CAD model. Furthermore, the components
are coloured with different colors depending on the intensity
of the force.
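
To make the force overlay concrete, the sketch below combines the X, Y and Z force components at the tool center point into a resultant magnitude and maps that magnitude to a display colour. It is a minimal illustration, not the application described above: the function names, the green-to-red colour ramp and the `max_force` scale are all assumptions.

```python
import math


def resultant_force(fx, fy, fz):
    """Magnitude of the resultant force vector from its X, Y and Z components."""
    return math.sqrt(fx * fx + fy * fy + fz * fz)


def intensity_colour(magnitude, max_force):
    """Map a force magnitude to an (R, G, B) colour.

    Assumed colour ramp: pure green at zero force, pure red at or above
    max_force (a hypothetical upper limit chosen by the operator)."""
    ratio = min(max(magnitude / max_force, 0.0), 1.0)  # clamp to [0, 1]
    red = round(255 * ratio)
    green = round(255 * (1.0 - ratio))
    return (red, green, 0)


# Example reading in newtons (made-up values).
magnitude = resultant_force(3.0, 4.0, 12.0)
colour = intensity_colour(magnitude, max_force=26.0)
```

An AR overlay would redraw the vector and its colour every time a new force reading arrives from the robot controller.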


XV. CONCLUSION 


Throughout our research we have gained a better understanding
of augmented reality and its various applications.
Augmented reality will further blur the line between what is
real and what is computer generated by enhancing what we
see, hear, and feel. Augmented reality has just begun to show its
capabilities, and soon it will completely change how we perceive
the world around us. While augmented reality will be of
great use to us in the future, it has its drawbacks. Nevertheless,
the quality of our lives has improved considerably as a result
of augmented reality. In this seminar we discussed
augmented reality and its applications, such as augmented
reality in industry, in dentistry, in health and medicine, and
in learning. It has various amazing applications that
can very well allow us to live our lives more productively,
more safely, and more informatively. We found this topic very
interesting and look forward to researching it further. It has
possibilities beyond our imagination and perception.


Abstract—Zigbee is a standards-based wireless technology
developed to enable low-cost, low-power wireless machine-to-machine
(M2M) and internet of things (IoT) networks. Zigbee
supports much lower data rates and uses a mesh networking
protocol to avoid hub devices and create a self-healing architecture.
Zigbee technology is used to model and simulate
a wireless sensor network. A hybrid topology is designed by
using three possible combinations of Zigbee routing schemes,
considered in different scenarios to certify the reliability of this
communication. The results show that a combination of mesh
and star topologies is better for making an effective hybrid topology.

A centralized and automated billing system is provided using
RFID and Zigbee communication. Each product in a shopping
mall or supermarket is provided with an RFID (radio frequency
identification) tag to identify its type. In the study of Zigbee
technology, the overall Zigbee architecture is composed of a set of
layers. Every layer provides a set of services to the layer above it
in the network.

Basically, AODV, a reactive protocol, is the routing
protocol used to send data from one node to any other.
The smart plugs bypass the Zigbee network and utilize the packet
loss ratio to transmit data and commands. Because of its
low power consumption, the Zigbee network is
utilized to detect the parameters of each room. The coordinator of
each room gets the data obtained by the sensors and transmits it to
the PLC communication modem integrated in the smart
plugs. A wireless environmental monitoring method for aquaculture
based on Zigbee technology is also presented, together with the
hardware and software design of the monitoring network and the
sensor nodes. A smart home can be implemented using a Zigbee
system, which is considered an advanced technology. A method to
increase the stability of the Zigbee signal by focusing on the
shortcomings of Zigbee is also discussed.

Index Terms—Zigbee technology, hybrid topology construction, 
Zigbee cluster network, aquaculture monitoring, stabilization 
of Zigbee cluster network, automated billing system, smart 
home, parking using Zigbee, performance analysis on various 
environments. 


I. INTRODUCTION 


With the rapid development of wireless communication, the
smart home is applied widely because it can bring
convenience and energy saving to us[1]. Zigbee is a wireless
technology developed as an open global standard to address
the unique needs of low-cost, low-power wireless networks[2].
Zigbee networks are the most widely utilized wireless sensor
network (WSN) technology in smart home fields. Zigbee and
wireless local area network (WLAN) both operate in the 2.4 GHz
band, and radio interference between them is inevitable[3]. There
are some methods to reduce interference between Zigbee and
WLAN[4]. One of the methods is the Packet Loss Ratio.


Aparna S Balan
Assistant Professor of Computer Applications
Vidya Academy of Science & Technology
Thrissur - 680501, Kerala
(email: aparna@vidyaacademy.ac.in)


[Figure: ZigBee network diagram; labels: wired connection, wireless connection, ZigBee network, existing network]


A smart automation system is an adaptive system that uses the
noise level to control a noise frequency, and it is best suited for
physically challenged people. Using voice commands to run
the device is extremely simple. In noisy settings, a mobile robot
device may actively identify a human voice,
which is recorded using a wireless microphone. The
voice command is given to a voice recognition board,
then the voice is processed in the ARM7 processor and
transmitted to the receiver side by an RF transmitter. Unlike
a pure Zigbee smart home, the backbone network of a smart
home based on Zigbee and smart plugs is the power line.
The smart plugs bypass the Zigbee network and utilize the packet
loss ratio to transmit data and commands. Zigbee networks
are utilized to detect the parameters of each room. A method
to increase the stability of the Zigbee signal, focusing on
the shortcomings of Zigbee, is described. This report also presents
idle channel evaluation and its algorithm, and a method to detect
and judge the signal strength.
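
As a rough illustration of what idle channel evaluation based on signal strength can look like, the sketch below picks the quietest of the sixteen 2.4 GHz IEEE 802.15.4 channels (numbered 11 to 26) from a set of measured energy levels. It is not the algorithm of the cited work: the function name, the input format and the -85 dBm busy threshold are assumptions made for this example.

```python
ZIGBEE_CHANNELS = range(11, 27)  # 2.4 GHz IEEE 802.15.4 channels 11..26


def pick_idle_channel(energy_dbm, busy_threshold_dbm=-85.0):
    """Choose the channel with the lowest measured energy.

    energy_dbm maps channel number -> measured energy in dBm (hypothetical
    scan results). Returns (channel, is_idle), where is_idle tells whether
    the chosen channel is below the assumed busy threshold."""
    candidates = {ch: e for ch, e in energy_dbm.items() if ch in ZIGBEE_CHANNELS}
    if not candidates:
        raise ValueError("no valid Zigbee channel in the scan results")
    # Lowest energy wins; ties are broken by the lower channel number.
    best = min(candidates, key=lambda ch: (candidates[ch], ch))
    return best, candidates[best] < busy_threshold_dbm


# Made-up scan: channel 11 overlaps a busy WLAN, channels 15 and 26 are quiet.
scan = {11: -60.0, 15: -92.5, 20: -88.0, 26: -92.5}
channel, is_idle = pick_idle_channel(scan)
```

A coordinator could run such a scan before forming the network and again whenever the observed packet loss rises.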

Zigbee devices often transmit data over longer distances
by passing data through intermediate devices to reach more
distant ones, creating a mesh network, which is a network
with no centralized control or high-power transmitter
or receiver able to reach all the network devices. A centralized
and automated billing system can be provided using RFID
and Zigbee communication. Each product in a supermarket is
provided with an RFID (radio frequency identification) tag to
identify its type. RFID tags may soon become the most pervasive
microchip in history. The most basic RFID system consists
of small transponders, or tags, attached to physical objects.
When wirelessly interrogated by RFID transceivers, or readers,
tags respond with some identifying information that may be
associated with arbitrary data records. Thus, RFID systems are
one type of automatic identification system, like optical bar
codes.

In supermarkets, there is a need to calculate how many
products are sold and to generate the bill for each customer.
Cashiers’ desks are placed in a position to promote circulation.
Many supermarket chains are attempting to further reduce
labour costs by shifting to self-service check-out machines,
where a single employee can oversee a group of four or five
machines at once, assisting multiple customers at a time. An
RFID and Zigbee based system can provide automatic billing to
avoid queues in malls and supermarkets.
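
The billing idea can be sketched as a lookup from tag identifiers to product records followed by a sum. The catalogue, tag identifiers and prices below are hypothetical, and the Zigbee transport that would carry the tag reads from the trolley to the billing server is left out; prices are kept in the smallest currency unit to avoid rounding errors.

```python
from collections import Counter

# Hypothetical catalogue: RFID tag id -> (product name, unit price in paise).
CATALOGUE = {
    "TAG-001": ("Rice 1 kg", 6200),
    "TAG-002": ("Milk 500 ml", 2750),
    "TAG-003": ("Soap", 3500),
}


def make_bill(tag_reads):
    """Build an itemised bill from the tag ids reported by the RFID reader.

    Returns (lines, total) where each line is (name, quantity, amount)."""
    counts = Counter(tag_reads)
    lines = []
    total = 0
    for tag_id in sorted(counts):
        if tag_id not in CATALOGUE:
            raise KeyError(f"unknown RFID tag: {tag_id}")
        name, unit_price = CATALOGUE[tag_id]
        amount = unit_price * counts[tag_id]
        lines.append((name, counts[tag_id], amount))
        total += amount
    return lines, total


# Tag ids as they might arrive from the reader while items are added.
bill_lines, bill_total = make_bill(["TAG-002", "TAG-001", "TAG-002"])
```

Because each tag identifies a product type, repeated reads of the same identifier are treated here as multiple units of that product.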

Recently, huge parking areas have been built widely in large
commercial complexes, amusement parks, shopping malls
and expressway parking areas. In such parking areas,
it is difficult for drivers to find a vacant parking space.
Here, we focus on the parking areas of large shopping
malls. To solve this problem, smart parking systems
have been investigated. Zigbee terminals are installed in the
vehicles and parking spaces, and a network is formed among
the Zigbee terminals. Zigbee is one of the sensor network
systems, and it can connect more than 65000 devices
in the network simultaneously. In this system, the shortest
path from the entrance of the parking area to a vacant space
is derived. The information is transmitted to the driver in the
vehicle using the Zigbee network. The system leads to efficient
navigation to a vacant parking space near the entrance of
the shopping mall.

The large-scale, intensive, and open sea
breeding development of aquaculture has led to a requirement
for more accurate and timely culture control. Monitoring sites
by manual methods may result in data loss. Conventional
cable monitoring is also inconvenient. The development of
wireless transmission can be used to solve these
problems. This technology has the advantages of low cost, low
power consumption, a distributed system, and ad hoc network
capabilities.
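
For the smart parking system described earlier, deriving the shortest path from the entrance to a vacant space can be sketched with a breadth-first search over a grid map of the parking area. This is only an illustration: the grid encoding (`.` for a drivable lane, `V` for a vacant space, `X` for an occupied space or wall) and the function name are assumptions, and in a real deployment the vacancy information would come from the Zigbee terminals.

```python
from collections import deque


def nearest_vacant_space(grid, entrance):
    """Breadth-first search from the entrance to the nearest vacant space.

    grid is a list of equal-length strings; entrance is a (row, col) tuple.
    Returns the list of cells on a shortest path, or None if no vacant
    space can be reached."""
    rows, cols = len(grid), len(grid[0])
    queue = deque([entrance])
    parent = {entrance: None}  # also serves as the visited set
    while queue:
        cell = queue.popleft()
        r, c = cell
        if grid[r][c] == "V":
            path = []
            while cell is not None:  # walk back to the entrance
                path.append(cell)
                cell = parent[cell]
            return path[::-1]
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (r + dr, c + dc)
            inside = 0 <= nxt[0] < rows and 0 <= nxt[1] < cols
            if inside and grid[nxt[0]][nxt[1]] != "X" and nxt not in parent:
                parent[nxt] = cell
                queue.append(nxt)
    return None


# Made-up parking map with the entrance at the top-left corner.
LOT = [
    "..XV",
    ".X..",
    "...X",
    "XV..",
]
route = nearest_vacant_space(LOT, (0, 0))
```

Breadth-first search is sufficient here because every move between neighbouring cells has the same cost; weighted lanes would call for Dijkstra's algorithm instead.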

The Zigbee wireless sensor network node for aquaculture
monitoring covers the hardware and software
design of the nodes and the overall system design scheme.
The overall system design contains the RF module circuit of
the Zigbee node, the communication module circuit of the Zigbee
parent node, and the communication module circuit of the Zigbee
child node. The Zigbee node software, which contains the Zigbee
communication software for the coordinator device, is also detailed.

Zigbee technology is also used to model and simulate a
wireless sensor network. A hybrid topology is designed by
using three possible combinations of Zigbee routing schemes,
considered in different scenarios to certify the reliability of this
communication. The network parameters, namely throughput,
delay, packet delivery ratio, and network load, are measured during
these scenarios. The results show that a combination of
mesh and star topologies is better for making an effective hybrid
topology. Recently, Zigbee has attracted a lot of attention for its
several superior features, such as rich functions in a small device,
an effective data communication facility over short distances, and
low power requirement and implementation cost. Due to these
excellent features, Zigbee has been applied far and wide in various
areas as a good wireless network solution, particularly where
wired network service is infeasible. A Zigbee network can
be configured in a star, tree, or mesh topology. Topology
formation is an important issue in a wireless sensor network.
Performance parameters such as energy consumption, network
lifetime, data delivery, and field coverage depend on the network
topology.

Zigbee is a distinctive communications standard principally
deployed for low-rate wireless personal area networks.
When a Zigbee cluster network generates more traffic, the
performance of the network tends to decline due to a lack of
bandwidth utilisation. So, a basic structure is proposed for the
Zigbee cluster tree network that governs transit and wait
communication and granted time. The overall Zigbee
architecture is composed of a set of layers. The most popular
Zigbee topology can be described as point-to-point,
low power and flexible routing. The work on Zigbee based wireless
sensor networks and performance analysis in various environments
focuses particularly on analysing the performance of the developed
prototype system. In analyzing the performance, quality
of service (QoS) parameters of communication are investigated.
These parameters, i.e., delay, throughput and packet loss,
are investigated as a function of sensor node distance and
transmitted packet size under line of sight (LOS) and non-line
of sight (NLOS) conditions.


II. AN EFFICIENT HYBRID TOPOLOGY CONSTRUCTION IN 
ZIGBEE SENSOR-NETWORK 


A. Communication Architecture 


The Zigbee protocol specifies a wireless technology based
on the IEEE 802.15.4 standard for wireless personal area
networks (WPANs)[5]. IEEE 802.15.4 is a standard that defines
the physical (PHY) and medium access control (MAC) layers for
low-power and low-data-rate wireless networks[6]. An 802.15.4
network can work in either beacon-enabled or non-beacon-enabled
mode. In the beacon-enabled mode[7], the network is
controlled by a coordinator that provides synchronization.


B. Simulation 


The simulation is carried out on the basis of three
scenarios using OPNET Modeler 14.5. In these three scenarios
the possible combinations of star, tree and mesh topologies
are considered, and each scenario has two fixed Zigbee
coordinators. The performance of these hybrid
routing schemes is evaluated under different network configurations.
Three scenarios have been designed, in which the nodes are placed
randomly anywhere in the network[8]. For evaluation of the
Zigbee network, the scenarios are described as follows. In the first
scenario, a combination of star and tree routing schemes is
implemented in a network to design a new hybrid topology,
and the performance of the network is investigated as the number
of nodes varies from 20 to 100[9]. The second scenario combines
the mesh and tree topologies, and the investigation of the network
is similar to the first scenario. The third scenario designs a hybrid
network topology by using star and mesh topologies in a single
network, and the simulation is done on the basis of the above
parameters.
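
The scenario set-up described above can be written down compactly. The sketch below only enumerates the three pairwise combinations of star, tree and mesh, with two coordinators per scenario and randomly placed nodes; it does not reproduce the OPNET models. The step of 20 between node counts, the 100 m square area and the fixed random seed are assumptions made for the example.

```python
import itertools
import random

TOPOLOGIES = ("star", "tree", "mesh")
NODE_COUNTS = range(20, 101, 20)  # 20..100 nodes; the step of 20 is assumed


def build_scenarios(area_m=100.0, seed=0):
    """Enumerate hybrid-topology scenarios with randomly placed nodes.

    Each scenario pairs two of the three routing schemes, has two fixed
    coordinators, and places its nodes uniformly in an area_m x area_m square
    (the area size is a hypothetical value)."""
    rng = random.Random(seed)  # fixed seed keeps the placement repeatable
    scenarios = []
    for pair in itertools.combinations(TOPOLOGIES, 2):
        for count in NODE_COUNTS:
            positions = [
                (rng.uniform(0.0, area_m), rng.uniform(0.0, area_m))
                for _ in range(count)
            ]
            scenarios.append(
                {"hybrid": pair, "coordinators": 2, "nodes": count,
                 "positions": positions}
            )
    return scenarios


scenarios = build_scenarios()
hybrids = sorted({s["hybrid"] for s in scenarios})
```

Each dictionary could then be handed to a simulator of choice to collect throughput, delay, packet delivery ratio and network load.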


C. Results 


On the basis of the above-described scenarios the following 
result has been observed: 


e Throughput: 
Throughput is the ratio of the total amount of data that 
a receiver receives from a sender to a time it takes for 
receiverto get the last packet. Throughput is the data 
quantity transmitted correctly starting from the source to 
the destination within a specified time (seconds)[10]. 

e Network load: 
In Fig.1,It represents the total load (in bits/sec) submitted 
to 802.15.4 MAC by all higher layers in all WPAN 
nodes of the network [11]. Here we conclude that the 
combination of ST (star-tree) topology has the maximum 
networking load as compare to MT (mesh-tree) and SM 
(star-mesh) topologies. 
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Fig. 1. Throughput of hybrid topologies with 20 nodes. 


• Packet delivery ratio:
The packet delivery ratio (PDR), shown in Fig. 2, is the
ratio of the number of data packets that were successfully
delivered to end devices to the number of data packets that
should have been delivered.

• Delay:
A packet generated at the application layer may arrive
at the MAC layer for transmission during the sleep interval.
Since the physical layer does not transmit during the sleep
interval, the packet is buffered for transmission in the
next superframe duration. This process results in delay.

Fig. 2. PDR of hybrid topologies with varying number of nodes
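
The throughput and packet delivery ratio definitions above can be written as a short calculation. The sketch below is only an illustration: the function names and the sample numbers are assumptions and are not taken from the simulation results.

```python
def throughput_bps(bits_received, first_sent_s, last_received_s):
    """Throughput: bits received divided by the time taken to get the last packet."""
    duration = last_received_s - first_sent_s
    if duration <= 0:
        raise ValueError("duration must be positive")
    return bits_received / duration


def packet_delivery_ratio(delivered, expected):
    """PDR: packets delivered to end devices over packets that should be delivered."""
    if expected <= 0:
        raise ValueError("expected must be positive")
    return delivered / expected


# Hypothetical sample: 180 of 200 packets of 100 bytes each arrive within 10 s.
sample_throughput = throughput_bps(180 * 100 * 8, 0.0, 10.0)
sample_pdr = packet_delivery_ratio(180, 200)
print(sample_throughput, sample_pdr)
```

Both functions reject a non-positive denominator, since neither metric is defined for an empty measurement window or for zero expected packets.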


III. PERFORMANCE IMPROVEMENT IN ZIGBEE CLUSTER 
TREE NETWORK 


An Enhanced Distributed Adoptive Parent (EDAP) structure is
used for the Zigbee cluster tree topology [12]. The traffic
load changes with the direction of WSN communication. A
disparate exchange pack minimization approach is used to
estimate the bandwidth actually required under the different
traffic load cases of a Zigbee cluster tree connection. Data
collection must be efficient with respect to traffic load.
Without violating the principles of the cluster tree network,
EDAP gives compliant routing and expanded bandwidth [3]. Where
extra bandwidth is necessary, EDAP arranges the network
bandwidth where it is required to convey future information.


A. Wireless Cluster Tree Topology 


In the wireless cluster tree topology, each router treats its
nearby devices as its own cluster, which forms a star network.
Zigbee sends the data in guaranteed time slots (GTS) to achieve
a high delivery ratio.


B. Adoptive Parent Model 


Under the EDAP framework, when the bandwidth demand is higher,
extra routers are required to provide the demanded bandwidth.
Such a router is known as an adoptive parent, in addition to
the original parent.


C. Traffic Variation 


Three layer positions are used in the network topology. The
coordinator supports the routers, and with equal traffic in
the routers, the Zigbee coordinator receives data from the leaf
nodes [15].


D. Dissimilar Exchange Pack


In the Dissimilar Exchange Pack, the actual essential capacity
is planned depending on the traffic load. The Zigbee
coordinator takes the data from the source and the adjacent
router. The Zigbee router is constrained by the bandwidth load
and the demand setting [16].


E. Sensor Node Distribution


Sensor node distribution is done by the Zigbee coordinator with
the help of the adoptive parent, according to the diverse
bandwidth needs of the Zigbee routers. Each node is distributed
to a new parent taken from the set of ready new parents in the
network.


Parameter                  Quantity

Channel type               Channel/WirelessChannel
Radio-propagation model    Propagation/TwoRayGround
MAC                        Mac/802_15_4
Interface queue type       Queue/DropTail/PriQueue
Number of nodes            101
Routing protocol           AODV
Type of antenna            Omni-directional antenna


Fig. 3. Simulation parameters


IV. NETWORK SIMULATION 


The Network Simulator NS-2 (ns-allinone-2.28) is used to
perform the simulation because of its flexibility in simulating
a variety of networks and its ease of use. We have considered
a network scenario consisting of 101 nodes: one Zigbee
coordinator (ZC), 71 Zigbee routers (ZR) and 29 sensors, in an
area of 80 meters square. The AODV routing protocol is used to
analyze the behavior of the system. The simulation parameters
are listed in Fig. 3, and the following metrics are evaluated.


• Throughput:
The amount of data transferred from one place to another
in the network per unit time.

• Power consumption:
The power consumed in the network per packet relative to the
total number of nodes in the network; power should be saved
as far as possible.

• Packet Delivery Ratio:
PDR is the ratio of the number of packets received to the
total number of packets sent in the network.


V. ZIGBEE BASED WIRELESS SENSOR NETWORKS AND 
PERFORMANCE ANALYSIS IN VARIOUS ENVIRONMENTS 


Fig. 5 illustrates the throughput as a function of packet
size for different baud rates. The throughput is calculated as
the packet size (bits) divided by the transmission time
(seconds) required for the packet to be successfully received
at the receiver. The measurement results show that the
throughput increases as the baud rate increases [15]. A maximum
throughput of 19.2 kbps was achieved at a baud rate of
115200 bps using the largest packet size of 80 bytes. However,
this number is still far lower than the 250 kbps data rate
specified for the Zigbee protocol using Zigbee S2 modules [16].

Fig. 4. Flowchart I


The delay measurement results at different baud rates and
packet sizes are presented in Fig. 6. It is observed that a
packet requires more time to reach the receiver when a lower
baud rate is used. The longest delay of 125.283 ms was observed
during transmission at the lowest baud rate of 9600 bps with
the maximum payload of 80 bytes [17].

Fig. 5. Throughput measurements as a function of packet size at different
baud rates.

Fig. 6. Packet delay measurement as a function of packet size at different
baud rates.


Fig. 7 shows the percentage of packet loss at four different
measurement locations [19]. As expected, more packets are lost
during transmission as the sensor node moves further from the
coordinator node, with an additional wall in between as well.
Given the results in the figure, the packet loss increases
significantly as the transmitted packet size increases [20]. It
is observed that 95 packets were lost during transmission when
an 80-byte packet was sent from the most remote sensor node, at
location no. 4, to the coordinator.

As shown in the table in Fig. 8, the presence of the router
(2-hop configuration) in the network has a significant effect
on the data throughput as well as on the packet delay. The
packet delay increases significantly because of the additional
processing time in the router. The same happens in the
throughput measurement: in the multi-hop scenario, the
throughput decreases because the packets need more time to
reach the receiver. Given the results of these experiments, it
is observed that the performance of the multi-hop configuration
decreases slightly as the price of a larger coverage area.
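
The relation between packet size, delay and throughput in the single-hop measurements can be checked with a short calculation. The sketch below assumes that throughput is simply the packet size in bits divided by the measured delay, as the text defines it; the packet sizes, 1-hop delays and 1-hop throughputs are copied from the measurement table, and the names `ONE_HOP` and `kbps` are illustrative. Under this assumption the estimates stay within about 0.03 kbps of the reported 1-hop values. The 2-hop values follow the same simple relation less closely and are not included.

```python
# (packet size in bytes, 1-hop delay in ms, reported 1-hop throughput in kbps)
ONE_HOP = [
    (10, 36.896, 2.196),
    (20, 63.150, 2.537),
    (30, 79.100, 3.037),
    (40, 79.483, 4.031),
    (50, 89.746, 4.460),
    (60, 100.692, 4.770),
    (70, 112.425, 4.985),
    (80, 122.488, 5.227),
]


def kbps(packet_bytes, delay_ms):
    """Packet size in bits over delay; bits per millisecond equals kbps."""
    return packet_bytes * 8 / delay_ms


for size, delay, reported in ONE_HOP:
    print(f"{size:2d} bytes: estimated {kbps(size, delay):.3f} kbps, reported {reported:.3f} kbps")
```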


Fig. 7. Packet loss percentage at different packet size in four measurement
locations.


Packet size    Throughput (kbps)     Delay (ms)
(bytes)        1 Hop     2 Hops      1 Hop      2 Hops
10             2.196     1.230       36.896     70.446
20             2.537     1.735       63.150     120.638
30             3.037     2.269       79.100     125.004
40             4.031     2.975       79.483     135.725
50             4.460     3.077       89.746     168.738
60             4.770     3.581       100.692    154.100
70             4.985     3.858       112.425    160.117
80             5.227     4.262       122.488    180.400


Fig. 8. Throughput and packet delay measurement for single and multihop
configuration.


VI. DESIGN OF A ZIGBEE SENSOR NETWORK NODE FOR
AQUACULTURE MONITORING


A wireless environmental monitoring method for aquaculture
based on Zigbee technology is presented, together with the
hardware and software design of the aquaculture monitoring
sensor node. The essential part of a control program is the
main program, which is where the program is initiated and the
main functions are realized. After the terminal device is
powered on, a series of initialization processes are carried
out. These mainly include the initialization of the target
board hardware, the initialization of the peripheral
interfaces, and the initialization of the related variables.
Then, the program enters an endless loop to continuously
detect which events are triggered. Once an event is triggered,
the corresponding subprogram is called to implement the
appropriate processing. The main program flow of the terminal
device control software is shown here.
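
The main program flow described above (initialization, then a loop that dispatches triggered events to subprograms) can be sketched as follows. This is a minimal illustration and not the firmware of the node: the event names, the handler functions and the use of a queue are assumptions, and the loop stops when the queue is empty so that the sketch terminates, whereas real firmware would keep waiting for events.

```python
from collections import deque


def initialize():
    """Initialise board hardware, peripheral interfaces and variables (placeholders)."""
    return {"hardware_ready": True, "peripherals_ready": True, "readings": []}


def on_sample(state, payload):
    state["readings"].append(payload)       # e.g. store one sensor reading


def on_report(state, payload):
    print("report:", state["readings"])     # e.g. send readings to the coordinator


HANDLERS = {"sample": on_sample, "report": on_report}   # hypothetical event names


def main_loop(state, events):
    """Detect triggered events and call the corresponding subprogram."""
    while True:
        if not events:
            break                           # sketch only: firmware would wait here
        name, payload = events.popleft()
        handler = HANDLERS.get(name)
        if handler is not None:
            handler(state, payload)
    return state


state = main_loop(initialize(), deque([("sample", 7.1), ("sample", 7.3), ("report", None)]))
```

Keeping the handlers in a table makes it possible to add a new event type without changing the loop itself.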


VII. AUTOMATIC STABILIZATION OF ZIGBEE NETWORK 


Fig. 9. Flowchart II


Fig. 10. Flowchart III 


Zigbee signal transmission is two-way communication that
follows a client-server model. The Zigbee gateway, configured
with a Zigbee coordinator, acts as the server and the Zigbee
sub-device acts as the client. Usually, the Zigbee gateway
sends query and control messages to the Zigbee sub-device, and
the Zigbee sub-device returns status messages to the Zigbee
gateway. For example, the Zigbee sub-device sends a status
message to the Zigbee coordinator configured in the Zigbee
gateway, as shown here.

The signal sent from the Zigbee sub-device is received by
Zigbee coordinator A and, according to the Zigbee communication
protocol, is passed on to the upper layer of Zigbee Gateway A,
which is the application layer in the same hardware system.
Following the Zigbee application layer protocol, Zigbee
Gateway A sends the status data to the Zigbee Data Handler
process, which holds the main logic for handling Zigbee
sub-device status data. The strength judgement thread is woken
up once the signal strength has been calculated and stored.
The strength judgement thread gets the signal strength result
from flash or RAM using synchronization and a mutex lock, which
keep the two processes (signal strength calculation and
strength judgement) from conflicting when they access the same
memory address or critical resource.

After obtaining the signal strength from flash or RAM, the
judgement logic checks whether PR(d) < PR(a). PR(d) is the
signal intensity of the transmission between the Zigbee
sub-device and the Zigbee gateway. If PR(d) < PR(a), the signal
strength between the two Zigbee devices (Zigbee sub-device and
Zigbee gateway) is poor. The best solution to this problem is
to add another Zigbee gateway between the two original devices
to shorten the signal transfer distance. In this case, a
warning message pops up to remind the user to add a relay
(another gateway) between the two devices according to the
Zigbee networking protocol. If PR(d) >= PR(a), the signal
strength between the two Zigbee devices is fine, and there is
no need to add a relay between them.
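
The interaction between the signal strength calculation and the strength judgement thread can be sketched with a lock-protected shared value. The sketch below is an illustration under assumptions: PR(a) is treated as a fixed threshold, the strength values are given in dBm, and the class and function names are hypothetical. A condition variable (which wraps a mutex) is used so that the judgement thread is woken only after the strength has been stored.

```python
import threading


class StrengthStore:
    """Shared signal-strength value guarded by a condition variable (mutex inside)."""

    def __init__(self):
        self._cond = threading.Condition()
        self._value = None

    def store(self, pr_d):
        with self._cond:
            self._value = pr_d
            self._cond.notify_all()          # wake the judgement thread

    def wait_and_get(self, timeout=1.0):
        with self._cond:
            self._cond.wait_for(lambda: self._value is not None, timeout)
            return self._value


def needs_relay(pr_d, pr_a):
    """True when PR(d) < PR(a), i.e. the link is judged to be poor."""
    return pr_d < pr_a


def judge(store, pr_a, results):
    pr_d = store.wait_and_get()
    results.append("add relay" if needs_relay(pr_d, pr_a) else "ok")


def run_once(pr_d, pr_a):
    store, results = StrengthStore(), []
    worker = threading.Thread(target=judge, args=(store, pr_a, results))
    worker.start()
    store.store(pr_d)                        # calculation side stores the result
    worker.join()
    return results[0]


print(run_once(-90.0, -85.0))                # weak link, assumed threshold -85 dBm
print(run_once(-60.0, -85.0))                # strong link
```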


VIII. CONSTRUCTION OF SMART HOME BASED ON ZIGBEE
AND SMART PLUGS


A. Features of Zigbee and PLC Technology


1) Zigbee: A Zigbee network can be built flexibly and
conveniently. With its low power consumption, Zigbee is the
best choice for end devices of the Internet of Things. However,
a Zigbee network has a short transmission distance indoors and
suffers serious interference from WLAN, microwave and
Bluetooth.

2) PLC: Power line communication (PLC) uses power lines for
communication, which makes extra wiring unnecessary. PLC does
not need an extra power source. With its high transmission
rate, PLC can be used to transmit pictures and video, which
increases the range of applications of the smart home.

The Zigbee network is responsible only for collecting
environmental parameters. Data or commands are sent to the PLC
communication module via the RS485/232 conversion module by
the upper computer. The PLC communication module modulates the
data or commands into electrical signals, which are transmitted
through the power lines. In the smart plug, a PLC communication
module demodulates the data and commands and sends them to the
Microcontroller Unit (MCU) or the Zigbee coordinator via the
RS485/232 or RS232/422 conversion module respectively. Finally,
the Zigbee coordinator sends the data or commands to the Zigbee
end device, and the MCU sends the data or commands to the other
functional modules. Each smart plug is equipped with a PLC
communication transformer, and each room is equipped with a
Zigbee coordinator and some Zigbee sensors. Each room is an
isolated Zigbee network. The Zigbee coordinator, MCU, PLC
communication module and other functional modules are powered
by the AC/DC power source. Apart from the batteries of the
Zigbee sensors, no additional power source is necessary. A
current transformer can monitor the power consumption of each
home appliance; it is used by the smart home energy management
system (HEMS).


Fig. 11. Features


Functional modules are embedded in the smart plug, so smart
plugs can not only turn appliances on or off but also adjust
appliances to work at high efficiency. When an infrared
remote-control module is embedded in the smart plug, the upper
computer can be used locally or remotely to control the
infrared module to turn on or off the TV, air conditioner and
other infrared appliances. The Zigbee luminance sensors and air
quality sensors, placed indoors and outdoors, allow indoor
parameters to be compared with outdoor ones. If the illuminance
outdoors is stronger than indoors in the daytime, the electric
curtains are opened by the infrared module. If the air quality
outdoors is better than indoors, the upper computer reminds the
user to open the windows, or the air purifier is turned on.
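
The comparison of indoor and outdoor parameters can be sketched as a small rule function. This is only an illustration: the function name, the units and the action strings are assumptions, a higher air quality score is assumed to mean better air, and turning on the air purifier is interpreted here as the alternative taken when the outdoor air is not better than the indoor air.

```python
def decide_actions(lux_in, lux_out, air_in, air_out, is_daytime):
    """Return the actions suggested by comparing indoor and outdoor sensor values."""
    actions = []
    # Open the curtains only in the daytime and only if it is brighter outside.
    if is_daytime and lux_out > lux_in:
        actions.append("open curtains")
    # Assumption: a higher score means better air quality.
    if air_out > air_in:
        actions.append("remind user to open windows")
    else:
        actions.append("turn on air purifier")
    return actions


print(decide_actions(lux_in=200, lux_out=800, air_in=40, air_out=70, is_daytime=True))
```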


IX. CENTRAL AUTOMATED BILLING SYSTEM


A. Hardware Implementation and Design of Cart


The cart hardware consists of a microcontroller, a display unit
(LCD), an EEPROM, an RFID reader, a Zigbee transceiver and a
battery power source. The battery power source increases the
mobility of the device.

1) Microcontroller:
The AT89S52 is a low-power, high-performance CMOS 8-bit
microcontroller with 8K bytes of in-system programmable
Flash memory. The device is manufactured using Atmel's
high-density nonvolatile memory technology and is compatible
with the industry-standard 80C51 instruction set and pinout.
The on-chip Flash allows the program memory to be
reprogrammed in-system.

Fig. 12. Schematic diagram


2) Zigbee:
Zigbee is expected to provide low-cost and low-power
connectivity for equipment that needs a battery life of
several months to several years but does not require data
transfer rates as high as those enabled by Bluetooth.

3) EEPROM: 
The AT24C02 provides 2048 bits of serial electrically 
erasable and programmable read-only memory (EEP- 
ROM) organized as 256 words of 8 bits each. The device 
is optimized for use in many industrial and commercial 
applications where low-power and low-voltage operation 
are essential. 

4) RFID Reader: 
Radio-frequency identification (RFID) is a technology 
to electronically record the presence of an object using 
radio signals. It is used for inventory control or timing 
sporting events. RFID is not a replacement for the bar- 
coding, but a complement for distant reading of codes. 
The technology is used for automatically identifying a 
person, a package or an item. 
Each cart is fitted with a product identification device
(PID). Through Zigbee communication, the PID sends its
information to the central automated billing system, which
calculates the net price of the purchased products.
Customers can get their billing information at the packing
section according to their Cart Identification Number. There
is even no need for a cash collector if the customer uses a
debit or credit card for bill payment.
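
The role of the central billing system can be sketched as a small class that collects the product reports sent by each cart and computes the net price per Cart Identification Number. The product codes, prices and class name below are hypothetical and serve only to illustrate the flow; the Zigbee transport itself is not modelled.

```python
from collections import defaultdict

# Hypothetical product prices, indexed by the code read from the RFID tag.
PRICES = {"P001": 40.0, "P002": 125.5, "P003": 15.0}


class BillingServer:
    """Collects product reports per cart and computes the net price."""

    def __init__(self, prices):
        self.prices = prices
        self.carts = defaultdict(list)

    def on_message(self, cart_id, product_id):
        """Handle one product report received from a cart's PID."""
        if product_id not in self.prices:
            raise KeyError(f"unknown product {product_id}")
        self.carts[cart_id].append(product_id)

    def net_price(self, cart_id):
        """Total price of all products reported so far for this cart."""
        return sum(self.prices[p] for p in self.carts.get(cart_id, []))


server = BillingServer(PRICES)
server.on_message("CART-7", "P001")
server.on_message("CART-7", "P003")
print(server.net_price("CART-7"))
```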


X. MODEL OF PARKING AREA 


The model of the parking area used in the simulations is
shown in Fig. 13. The parking model is that of an actual
shopping mall. The Zigbee terminals are categorized as routers
and end devices. The routers are installed along the paths of
the parking area every 10 m and in vehicles. The routers
communicate with all types of Zigbee devices. The Zigbee end
devices, which detect the occupation of a parking space, are
placed at each parking space. A Zigbee end device can
communicate only with the Zigbee routers. The Zigbee
coordinator is set at the entrance of the parking area. The
coordinator has the same communication function as a router,
and it collects the information on vacant parking spaces. It is
assumed that the communication range of Zigbee is about 30 m.
Each Zigbee device can communicate with the other Zigbee
devices within its communication range.


Anitta V J et al, “A Study on Zigbee Technology”


Proceedings of Vidya MCA Departmental Seminar (VMCADS - 2021), 22 - 23 November 2021


Vidya Academy of Science & Technology, Thrissur - 680501


Legend: ZigBee in car, ZigBee Router, ZigBee End Device, ZigBee Coordinator


Fig. 13. The model of the parking area

Fig. 14. Schematic diagram


In this study, the percentage of vacant parking spaces and
the interval between entering vehicles are examined to show the
effectiveness of the proposed parking system. The communication
procedure of the Zigbee network is shown in Figure 2. When a
vehicle equipped with a Zigbee device arrives at the entrance
of the parking area, the Zigbee coordinator set at the entrance
recognizes the vehicle. The Zigbee device in the vehicle joins
the Zigbee network of the parking area. The information is
transmitted to the Zigbee coordinator set at the entrance of
the parking area. After the vehicle receives the vacant space
information, an acknowledgement (ACK) packet is transmitted to
the vacant space. The ACK is transmitted in order to reserve a
vacant parking space. The parking occupancy information is
updated every second. The vehicle drives to the vacant space
according to the received information. The navigation ends when
the vehicle reaches the parking space. After the vehicle
reaches the parking space, the driver walks to the entrance of
the shopping mall. When a problem occurs, such as the vehicle
going the other (wrong) way, a new route from that point to the
vacant space is provided as soon as possible.


A. Performance Evaluation 


In this simulation, the communication range of Zigbee is set
to 30 m. It is assumed that the communication is perfect when
the distance between the Zigbee devices is less than the
communication range, whereas when the distance is more than
the communication range, the Zigbee devices do not communicate
at all.

Fig. 15. Flowchart IV

Fig. 16. Schematic diagram


• Average time for arriving when the rate of vacant parking
spaces is changed.

• The average time for arriving at the entrance of the
shopping mall from the entrance of the parking area is
evaluated when the interval at which the vehicles enter the
parking area is changed. The percentage in parentheses
means the ratio of vacant parking spaces.
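
The communication model assumed in this performance evaluation (perfect communication inside the 30 m range, none outside it) can be sketched as a distance test. The sketch is an illustration only: the coordinates and function names are assumptions, and a distance exactly equal to the range, which the text does not specify, is treated here as out of range.

```python
import math

COMM_RANGE_M = 30.0   # communication range assumed in the simulation


def can_communicate(a, b, comm_range=COMM_RANGE_M):
    """Perfect link when the distance is less than the range, no link otherwise."""
    return math.dist(a, b) < comm_range


def neighbours(node, others, comm_range=COMM_RANGE_M):
    """All other devices that the given device can reach directly."""
    return [o for o in others if o != node and can_communicate(node, o, comm_range)]


# Hypothetical routers placed every 10 m along one path of the parking area.
routers = [(0, 0), (10, 0), (20, 0), (30, 0), (40, 0)]
print(neighbours((0, 0), routers))
```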


XI. IMPLEMENTATION OF COMMUNICATION AID USING
ZIGBEE TECHNOLOGY


A. Transceiver


Both the transmitter and the receiver are combined in the
transceiver communication aid module.


B. Transmitter


The transmitter mainly consists of a speech recognition board,
an ARM7 processor and an RF transmitter. The speech recognition
board used is the HM2007, with a 4*3 matrix keypad to train the
voice commands [15-16]. Voice input is given through a
microphone. Because the HM2007 cannot store data itself, an
8 kB external static RAM is required to store data, with a 3 V
battery backup for this RAM. Up to 20 words, each of duration
1.92 seconds, can be stored. An RF transmitter is used in this
section, in which the HT-12E encodes the recognized voice
command so that the signal can be transmitted to the receiver
part. The ARM7-based LPC2148 microcontroller is used in this
section for its low power consumption, which makes it
particularly suitable for use in portable devices. Its on-chip
static RAM ranges from 8 to 40 kilobytes, with 32 to
512 kilobytes of on-chip flash program memory.

Fig. 17. Schematic diagram

The receiver part in this section is mobile, whereas the
transmitter part is static. The receiver part comprises an RF
receiver, a voice playback and recorder board, and an
ARM7-based LPC2148 microcontroller [17]. The signal from the
transmitter side is received by the RF receiver, which decodes
it on the receiver side. These signals then enable the robot to
move using a DC motor.


Fig. 18. Devices 


XII. A PROJECT DESIGN METHODOLOGY 


The components mainly used in the system are a speech
recognition board, an LPC2148-based ARM7 microcontroller, an RF
transceiver, and a voice playback and recorder board. In the
transmitter section, the speech recognition board, ARM7 board
and RF transmitter are connected together, and in the receiver
section the RF receiver, the voice playback and recorder board
and an ARM7 board with an attached LCD display are connected
together. At first, a voice command is given to the voice
recognition board. A 4*3 matrix keypad is mounted on the speech
recognition board, of which two buttons are TRAIN and CLR
(clear). First, unwanted noises are cleared; then the voice is
given to the board and trained when the user taps the TRAIN
button. If 77 is shown on the 7-segment display, there is an
error; 55 denotes a word that is too long and 99 shows no
match. After the voice has been trained, it is processed in the
ARM7 board. After processing, the data is transferred to the
receiver side by the RF transmitter.


On the receiver side, the radio frequency receiver receives
the data, which is processed again in the receiver part. Voice
input is then given to the recorder: by tapping the button on
the voice playback and recorder board, voice input is given and
recorded. The job of this board is to record the voice and play
it back later. The ARM7 board restores the data, and the robot
starts to move left, right, forward or backward according to
the input voice command assigned to each direction of the
robot.
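
The last step above, in which each recognized voice command is assigned to a direction of the robot, can be sketched as a lookup table. The word indices and the function names below are hypothetical; the sketch only illustrates the mapping and does not model the HM2007 board or the RF link.

```python
# Hypothetical assignment of trained word indices to robot directions.
DIRECTIONS = {1: "left", 2: "right", 3: "front", 4: "back"}


def direction_for(word_index):
    """Return the direction assigned to a recognised word, or None if unassigned."""
    return DIRECTIONS.get(word_index)


def drive(word_indices):
    """Translate a sequence of recognised words into robot moves, skipping unknown ones."""
    moves = []
    for index in word_indices:
        direction = direction_for(index)
        if direction is not None:
            moves.append(direction)
    return moves


print(drive([3, 1, 9, 4]))
```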


1) Zigbee Module:
Applications of the Zigbee module include wireless light
switches, electric meters with in-home displays, other
consumer low-rate data transfer, traffic control systems
and industrial equipment that requires short-range
communication. Zigbee is a high-level communication
protocol used to create personal area networks (PANs) with
small, low-power digital radios, for example in home
automation, health service data collection and other
low-power, low-bandwidth needs. It is designed for
small-scale projects that require a wireless connection.

2) ARM7 Board:
ARM7 is one of a family of CPUs based on the Reduced
Instruction Set Computer (RISC) architecture developed by
Advanced RISC Machines (ARM). The ARM architecture family
offers 64-bit and 32-bit execution states for scalable high
performance; the ARM7 core itself is a 32-bit core. It is a
load-store architecture with an orthogonal instruction set,
mostly single-cycle execution and a power-saving design.

3) LPC2148:
The LPC2148 is a popular microcontroller based on the
ARM7 architecture. It was developed by Philips and comes
with a range of built-in peripherals that make it more
powerful. It is a dependable choice for both beginner and
advanced application developers. The LPC2148 has a 128-bit
wide memory interface that allows high-speed operation at
60 megahertz.

4) Voice Recognition Board:
The Voice Recognition Board is a system for using the
voice recognition module in a simple and easy way. It
extracts and analyses the features of the human voice
delivered via a microphone to a machine or device. If the
degree of voice recognition accuracy is greater than 95%,
simple voice recognition is almost always used.

Fig. 19. Flowchart V


XIII. CONCLUSION


Zigbee is one of the global standards of communication
protocol, formulated by the relevant task force. It is a
relatively new standard that provides specifications for
devices that have low data rates and consume very little power,
and are therefore characterized by long battery life. Zigbee
technology possesses outstanding characteristics compared with
other wireless technologies. Because it follows a mesh
topology, it is robust in many situations; in addition, it can
provide high privacy and security. Zigbee is the most popular
industry wireless mesh networking standard for connecting
sensors, instrumentation and control systems.